Transductive Learning for Multi-Label Protein Subchloroplast Localization Prediction

Shibiao Wan, Man Wai Mak, Sun Yuan Kung

Research output: Journal article publication; Journal article; Academic research; peer-review

21 Citations (Scopus)

Abstract

Predicting the localization of chloroplast proteins at the sub-subcellular level is an essential yet challenging step to elucidate their functions. Most existing subchloroplast localization predictors are limited to single-location proteins and ignore multi-location chloroplast proteins. While recent studies have led to some multi-location chloroplast predictors, they usually perform poorly. This paper proposes an ensemble transductive learning method to tackle this multi-label classification problem. Specifically, given a protein in a dataset, its composition-based sequence information and its profile-based evolutionary information are extracted separately. Each of these two kinds of features is compared with the corresponding features of the other proteins in the dataset. The comparisons yield two similarity vectors, which are combined by weighting to form an ensemble feature vector. A transductive learning model based on the least squares and nearest neighbor algorithms is proposed to process the ensemble features. We refer to the resulting predictor as EnTrans-Chlo. Experimental results on a stringent benchmark dataset and a novel dataset demonstrate that EnTrans-Chlo significantly outperforms state-of-the-art predictors and, in particular, improves the overall actual accuracy by more than 4 percent (absolute). For readers' convenience, EnTrans-Chlo is freely available online at http://bioinfo.eie.polyu.edu.hk/EnTransChloServer/.
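
The weighted combination of two similarity vectors described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the similarity values, the weight, and the label sets are hypothetical, and the prediction step is a plain similarity-weighted nearest-neighbor vote, not the least-squares-based transductive model used by EnTrans-Chlo.

```python
def combine_similarities(sim_seq, sim_prof, weight):
    """Weighted combination of two equal-length similarity vectors.

    `weight` is the share given to the sequence-based similarities;
    the profile-based similarities receive 1 - weight (an assumed scheme).
    """
    if len(sim_seq) != len(sim_prof):
        raise ValueError("similarity vectors must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * a + (1.0 - weight) * b for a, b in zip(sim_seq, sim_prof)]


def predict_labels(ensemble, train_labels, k=2, threshold=0.4):
    """Similarity-weighted k-nearest-neighbor vote over label sets.

    This is a simplified stand-in for the paper's transductive model.
    """
    # Indices of the k training proteins most similar to the query.
    nearest = sorted(range(len(ensemble)),
                     key=lambda i: ensemble[i], reverse=True)[:k]
    scores = {}
    for i in nearest:
        for label in train_labels[i]:
            scores[label] = scores.get(label, 0.0) + ensemble[i]
    if not scores:
        return set()
    best = max(scores.values())
    # Keep every label scoring at least `threshold` times the best score,
    # so a query protein can receive more than one location.
    return {label for label, s in scores.items() if s >= threshold * best}


# Hypothetical similarities of one query protein to three training proteins.
sim_seq = [0.9, 0.2, 0.6]    # composition-based sequence similarity
sim_prof = [0.7, 0.1, 0.8]   # profile-based evolutionary similarity
train_labels = [{"stroma"}, {"envelope"}, {"stroma", "thylakoid membrane"}]

ensemble = combine_similarities(sim_seq, sim_prof, weight=0.5)
predicted = predict_labels(ensemble, train_labels, k=2, threshold=0.4)
print(sorted(predicted))
```

With these made-up inputs, the two most similar training proteins are the first and third, so the query inherits both "stroma" and "thylakoid membrane", which shows how a multi-label outcome can arise from the combined similarities.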
Original language: English
Article number: 7401011
Pages (from-to): 212-224
Number of pages: 13
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 14
Issue number: 1
Publication status: Published - 1 Jan 2017

Keywords

  • ensemble transductive learning
  • multi-label classification
  • profile alignment
  • Protein subchloroplast localization prediction

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
