Predicting subcellular localization of multi-location proteins by improving support vector machines with an adaptive-decision scheme

Shibiao Wan, Man Wai Mak

Research output: Journal article publication › Journal article › Academic research › peer-review

18 Citations (Scopus)

Abstract

From the perspective of machine learning, predicting subcellular localization of multi-location proteins is a multi-label classification problem. Conventional multi-label classifiers typically compare pattern-matching scores with a fixed decision threshold to determine the number of subcellular locations in which a protein resides. This simple strategy, however, can easily lead to over-prediction because it produces a large number of false positives. To address this problem, this paper proposes a more powerful multi-label predictor, namely AD–SVM, which incorporates an adaptive-decision (AD) scheme into multi-label support vector machine (SVM) classifiers. Specifically, given a query protein, a term-frequency based gene ontology vector is constructed by successively searching the gene ontology annotation database. Subsequently, the feature vector is classified by AD–SVM, which extends the binary relevance method with an adaptive decision scheme that essentially converts the linear SVMs to piecewise linear SVMs. Experimental results suggest that AD–SVM outperforms existing state-of-the-art multi-location predictors by at least 4 % (absolute) for a stringent virus dataset and 1 % (absolute) for a stringent plant dataset. Results also show that the adaptive-decision scheme can effectively reduce over-prediction while having an insignificant effect on the correctly predicted ones.
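The contrast between a fixed threshold and an adaptive decision can be sketched in a few lines. The abstract does not state the actual decision rule, so the rule below is an assumption made for illustration only: a location is kept when its SVM score is positive and at least min(1, theta * s_max), where s_max is the protein's highest score, and the top-scoring location is returned when no score is positive. The function names, the parameter theta, and the example scores are all hypothetical.

```python
from typing import List, Sequence


def fixed_threshold_labels(scores: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Binary-relevance baseline: keep every location whose score exceeds a fixed threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


def adaptive_labels(scores: Sequence[float], theta: float = 0.5) -> List[int]:
    """Illustrative adaptive decision (assumed rule, not taken from the abstract).

    The cutoff depends on the query protein's own maximum score, so weakly
    positive scores are discarded when another location scores much higher.
    """
    s_max = max(scores)
    if s_max <= 0:
        # No positive score: fall back to the single best-scoring location.
        return [scores.index(s_max)]
    cutoff = min(1.0, theta * s_max)  # assumed form of the adaptive cutoff
    return [i for i, s in enumerate(scores) if s > 0 and s >= cutoff]


# Hypothetical per-location SVM scores for one query protein.
example_scores = [2.0, 0.3, -0.5, 1.2]
fixed = fixed_threshold_labels(example_scores)
adaptive = adaptive_labels(example_scores, theta=0.5)
```

With these made-up scores the fixed threshold keeps three locations, while the assumed adaptive rule drops the weakly positive one and keeps two, which is the kind of over-prediction reduction the abstract describes.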
Original language: English
Pages (from-to): 399-411
Number of pages: 13
Journal: International Journal of Machine Learning and Cybernetics
Volume: 9
Issue number: 3
Publication status: Published - 1 Mar 2018

Keywords

  • Adaptive decisions
  • Multi-label classification
  • Protein subcellular localization
  • Support vector machines

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
