Speeding up subcellular localization by extracting informative regions of protein sequences for profile alignment

Wei Wang, Man Wai Mak, Sun Yuan Kung

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

5 Citations (Scopus)

Abstract

The functions of proteins are closely related to their subcellular locations. In the post-proteomics era, the amount of gene and protein data grows exponentially, which necessitates predicting subcellular localization by computational means. This paper proposes reducing the computational burden of alignment-based approaches to subcellular localization prediction by exploiting the information provided by N-terminal sorting signals. To this end, a cascaded fusion of cleavage site prediction and profile alignment is proposed. Specifically, a cleavage site predictor identifies the informative segments of protein sequences, and only these segments are passed to a homology-based classifier that predicts the subcellular locations. Experimental results on a newly constructed dataset show that the method combines the strengths of both approaches and attains a higher accuracy than using the full-length sequences. Moreover, the method reduces the computation time by a factor of 20. We expect the method to be useful to biologists conducting large-scale protein annotation and to bioinformaticians performing preliminary investigations of new algorithms that involve pairwise alignments.
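
The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper's profile alignment and homology-based classifier are replaced here by a plain Smith-Waterman local alignment with identity scoring and a nearest-neighbour decision, and `toy_cleavage_site` is a hypothetical placeholder rule rather than a real cleavage site predictor. All function names, scoring parameters, and sequences are assumptions made for illustration only.

```python
import re
from typing import Callable, List, Tuple


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Smith-Waterman local alignment score with a linear gap penalty.

    Stand-in for profile alignment; the cost is O(len(a) * len(b)),
    which is why shortening the sequences reduces computation.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def toy_cleavage_site(seq: str, default: int = 8) -> int:
    """Hypothetical placeholder predictor (NOT a real one).

    Returns the position just after the first 'A?A' motif, or a
    default length when no such motif is present.
    """
    m = re.search(r"A.A", seq)
    return m.end() if m else min(default, len(seq))


def informative_segment(seq: str, predictor: Callable[[str], int]) -> str:
    """Stage 1: keep only the N-terminal part up to the predicted site."""
    return seq[:predictor(seq)]


def classify(query: str, training: List[Tuple[str, str]],
             predictor: Callable[[str], int]) -> str:
    """Stage 2: align truncated segments only; return the best match's label."""
    q = informative_segment(query, predictor)
    scored = [(local_alignment_score(q, informative_segment(s, predictor)), label)
              for s, label in training]
    return max(scored, key=lambda pair: pair[0])[1]


# Made-up sequences and labels, purely for demonstration.
training = [
    ("MKKLLAVAGGGGGGGGGGGGGGGG", "secretory"),
    ("MRRSSTQWPPPPPPPPPPPPPPPP", "mitochondrial"),
]
query = "MKKLLAVAGGGGTTTTGGGGGGGG"

print(classify(query, training, toy_cleavage_site))
# Alignment table size: full-length pair versus truncated pair.
full_cells = len(query) * len(training[0][0])
short_cells = (len(informative_segment(query, toy_cleavage_site))
               * len(informative_segment(training[0][0], toy_cleavage_site)))
print(full_cells, short_cells)
```

Because the alignment table grows with the product of the two sequence lengths, truncating both sequences shrinks the work per pair; the saving in this toy example says nothing about the factor reported in the paper, which depends on the real data and predictor.
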
Original language: English
Title of host publication: 2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2010
Pages: 147-154
Number of pages: 8
Publication status: Published - 20 Aug 2010
Event: 2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2010 - Montreal, QC, Canada
Duration: 2 May 2010 to 5 May 2010

Conference

Conference: 2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2010
Country: Canada
City: Montreal, QC
Period: 2/05/10 to 5/05/10

Keywords

  • Cleavage sites prediction
  • Profiles alignment
  • Protein sequences
  • Subcellular localization
  • Support vector machines

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Biomedical Engineering
