Abstract
In feature gene selection, filtering models focus on classification accuracy while ignoring gene redundancy. Gene clustering, on the other hand, finds correlated genes without considering their predictive ability. It is valuable to combine the two so that each compensates for the other's weakness. We report a new feature gene extraction algorithm, Double-thresholding Extraction of Feature Gene (DEFG), that combines gene filtering and gene clustering. It first pre-selects a feature gene set from the original dataset. A modified gene clustering is then applied to refine this set. In the clustering step, specific designs are employed to balance the predictive ability and the redundancy of the extracted feature genes. We tested DEFG on a microarray dataset and compared its performance with that of two benchmark algorithms. The experimental results show that DEFG outperforms both in terms of internal validation accuracy and external validation accuracy. DEFG can also generalize the pattern structure from a small number of training samples.
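The filter-then-refine idea described in the abstract can be illustrated with a minimal sketch. This is not the DEFG algorithm itself, whose details the abstract does not give. It assumes a binary class label, a t-like relevance score for the filtering stage, and a greedy correlation check standing in for the modified gene clustering. The names `relevance_scores`, `two_threshold_select`, `score_min`, and `corr_max` are hypothetical.

```python
import numpy as np


def relevance_scores(X, y):
    """Absolute two-sample t-like score per gene.

    X: (n_samples, n_genes) expression matrix; y: binary labels (0/1).
    The score is an assumption; the abstract does not name the filter.
    """
    a, b = X[y == 0], X[y == 1]
    num = np.abs(a.mean(axis=0) - b.mean(axis=0))
    den = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return num / (den + 1e-12)  # small constant avoids division by zero


def two_threshold_select(X, y, score_min=2.0, corr_max=0.8):
    """Pre-select genes by relevance, then drop redundant ones.

    score_min and corr_max are hypothetical thresholds chosen for illustration.
    """
    scores = relevance_scores(X, y)
    # Stage 1 (filtering): keep genes whose score reaches the first threshold.
    candidates = np.flatnonzero(scores >= score_min)
    # Visit the most predictive candidates first.
    candidates = candidates[np.argsort(-scores[candidates])]
    selected = []
    for g in candidates:
        # Stage 2 (redundancy check): skip a gene that is strongly
        # correlated with any gene already kept.
        redundant = any(
            abs(np.corrcoef(X[:, g], X[:, s])[0, 1]) > corr_max for s in selected
        )
        if not redundant:
            selected.append(int(g))
    return selected
```

Visiting candidates in order of decreasing score means that, within a group of correlated genes, the most predictive one is kept. This is one simple way to trade off predictive ability against redundancy; a real clustering step would group genes explicitly before choosing representatives.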
Original language | English |
---|---|
Title of host publication | Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009 |
Pages | 197-202 |
Number of pages | 6 |
Publication status | Published - 1 Dec 2009 |
Event | 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009 - Washington, DC, United States Duration: 1 Nov 2009 → 4 Nov 2009 |
Conference
Conference | 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009 |
---|---|
Country/Territory | United States |
City | Washington, DC |
Period | 1/11/09 → 4/11/09 |
Keywords
- Classification
- Clustering
- Extraction
- Feature gene
ASJC Scopus subject areas
- Biomedical Engineering
- Health Informatics
- Health Information Management