An effective data mining technique for the multi-class protein sequence classification

Patrick C H Ma, Chun Chung Chan

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

Abstract

One way to understand the molecular mechanism of a cell is to understand the function of each protein encoded in its genome. The function of a protein depends largely on the three-dimensional structure the protein assumes after folding. Because determining three-dimensional structure experimentally is difficult and expensive, an easier and cheaper approach is to examine the primary sequence of a protein and determine its function by classifying the sequence into the corresponding functional family. In this paper, we propose an effective data mining technique for multi-class protein sequence classification. The proposed technique has been tested on different sets of protein sequences. Experimental results show that it outperforms other existing protein sequence classifiers and can effectively classify proteins into their corresponding functional families.
Original language: English
Title of host publication: 2nd International Conference on Bioinformatics and Biomedical Engineering, iCBBE 2008
Publisher: IEEE Computer Society
Pages: 486-489
Number of pages: 4
ISBN (Print): 9781424417483
Publication status: Published - 1 Jan 2008
Event: 2nd International Conference on Bioinformatics and Biomedical Engineering, iCBBE 2008 - Shanghai, China
Duration: 16 May 2008 to 18 May 2008

Conference

Conference: 2nd International Conference on Bioinformatics and Biomedical Engineering, iCBBE 2008
Country/Territory: China
City: Shanghai
Period: 16/05/08 to 18/05/08

Keywords

  • Bioinformatics
  • Data mining
  • Protein sequence classification

ASJC Scopus subject areas

  • Biotechnology
  • Biomedical Engineering
