Abstract
Prediction of protein cleavage sites is an important step in drug design. Recent research has demonstrated that conditional random fields (CRFs) can predict the cleavage site locations of signal peptides with performance comparable to that of SignalP, a state-of-the-art predictor based on hidden Markov models and neural networks. This paper investigates the degree of complementarity between CRF-based predictors and SignalP and proposes exploiting this complementarity to fuse the two predictors. It was found that about 40% of the sequences incorrectly predicted by SignalP can be correctly predicted by the CRF, and that about 30% of the sequences incorrectly predicted by the CRF can be correctly predicted by SignalP. This suggests that the two predictors complement each other. The paper also shows that the performance of the CRF can be further improved by constructing the state features from spatially dispersed amino acids in the training sequences.
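The complementarity figures quoted in the abstract can be read as conditional fractions: of the sequences one predictor gets wrong, what share does the other get right? The following is a minimal sketch of that calculation, not the paper's implementation. The function name `complementarity` and the per-sequence correctness flags are hypothetical, and the toy data are made up for illustration; they are not results from the paper.

```python
from typing import Sequence


def complementarity(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Fraction of sequences missed by predictor A that predictor B gets right.

    Both inputs hold one flag per sequence (True = cleavage site predicted
    correctly), in the same sequence order. Hypothetical helper.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both predictors must be scored on the same sequences")
    # Keep B's outcome only for the sequences that A predicted incorrectly.
    rescued = [b for a, b in zip(correct_a, correct_b) if not a]
    if not rescued:
        return 0.0  # A made no errors, so there is nothing for B to recover
    return sum(rescued) / len(rescued)


# Toy, made-up outcomes for eight sequences (not data from the paper).
signalp_correct = [True, False, False, True, False, True, False, False]
crf_correct = [True, True, False, False, True, True, False, False]

# Share of SignalP errors that the CRF predicts correctly.
crf_rescues_signalp = complementarity(signalp_correct, crf_correct)
# Share of CRF errors that SignalP predicts correctly.
signalp_rescues_crf = complementarity(crf_correct, signalp_correct)
```

The measure is asymmetric, which is why the abstract reports two separate percentages, one for each direction.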
Original language | English |
---|---|
Title of host publication | APSIPA ASC 2009 - Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference |
Pages | 716-721 |
Number of pages | 6 |
Publication status | Published - 1 Dec 2009 |
Event | Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference, APSIPA ASC 2009 - Sapporo, Japan; Duration: 4 Oct 2009 → 7 Oct 2009 |
Conference
Conference | Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference, APSIPA ASC 2009 |
---|---|
Country/Territory | Japan |
City | Sapporo |
Period | 4/10/09 → 7/10/09 |
Keywords
- Cleavage sites
- Conditional random fields
- Discriminative models
- Protein sequences
- Signal peptides
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Electrical and Electronic Engineering
- Communication