GOASVM: Protein subcellular localization prediction based on Gene ontology annotation and SVM
- Resource Type
- Conference
- Authors
- Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan
- Source
- 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2229-2232, Mar. 2012
- Subject
- Signal Processing and Analysis
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Proteins
Databases
Vectors
Support vector machines
Ontologies
Training
Amino acids
Protein subcellular localization
Gene Ontology Annotation
Gene Ontology
GO terms
- Language
- ISSN
- 1520-6149
2379-190X
Protein subcellular localization is an essential step in annotating proteins and designing drugs. This paper proposes GOASVM, a functional-domain based method that makes full use of the Gene Ontology Annotation (GOA) database to predict the subcellular locations of proteins. GOASVM uses the accession number (AC) of a query protein and the accession numbers (ACs) of homologous proteins returned by PSI-BLAST as query strings to search the GOA database. The occurrences of a set of predefined GO terms are used to construct GO vectors, which are then classified by support vector machines (SVMs). The paper investigates two different approaches to constructing the GO vectors. Experimental results suggest that using the ACs of homologous proteins as the query strings can achieve an accuracy of 94.68%, which is significantly higher than all previously published results on the same dataset. GOASVM is freely accessible to the public as a user-friendly web server at http://bioinfo.eie.polyu.edu.hk/mGoaSvmServer/GOASVM.html.
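The GO-vector construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The accession keys, the toy lookup table standing in for the GOA database, the four-term vocabulary, and the function names are all hypothetical. The abstract does not say which two vector constructions were compared, so the count-based and binary variants shown here are assumptions chosen only to show how term occurrences might become fixed-length vectors.

```python
from collections import Counter

def terms_for_accessions(accessions, goa_lookup):
    """Collect every GO term annotated to any of the given accession numbers."""
    terms = []
    for ac in accessions:
        # Accessions with no GOA entry simply contribute no terms.
        terms.extend(goa_lookup.get(ac, []))
    return terms

def build_go_vector(go_terms_found, vocabulary, binary=False):
    """Map retrieved GO terms onto a fixed, predefined vocabulary.

    binary=False: each element is the number of occurrences of the term.
    binary=True:  each element is 1 if the term occurs at all, else 0.
    Both variants are illustrative assumptions, not the paper's definitions.
    """
    counts = Counter(go_terms_found)
    vector = [counts.get(term, 0) for term in vocabulary]
    if binary:
        vector = [1 if c > 0 else 0 for c in vector]
    return vector

# Hypothetical stand-in for a GOA database search result, keyed by
# made-up accession labels for two homologs of a query protein.
goa_lookup = {
    "HOMOLOG_A": ["GO:0005634", "GO:0005737"],
    "HOMOLOG_B": ["GO:0005634", "GO:0005634", "GO:0005739"],
}

# Hypothetical predefined vocabulary: nucleus, cytoplasm,
# mitochondrion, plasma membrane.
vocabulary = ["GO:0005634", "GO:0005737", "GO:0005739", "GO:0005886"]

found = terms_for_accessions(["HOMOLOG_A", "HOMOLOG_B"], goa_lookup)
count_vector = build_go_vector(found, vocabulary)
binary_vector = build_go_vector(found, vocabulary, binary=True)
print(count_vector)   # [3, 1, 1, 0]
print(binary_vector)  # [1, 1, 1, 0]
```

Vectors of this kind, one per protein and all of the same length, are the sort of input an SVM classifier would be trained on, with the known subcellular locations as labels.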