TY - GEN
T1 - SOM2CE: Double self-organizing map based cluster ensemble framework and its application in cancer gene expression profiles
AU - Yu, Zhiwen
AU - Chen, Hantao
AU - You, Jia
AU - Li, Le
AU - Han, Guoqiang
PY - 2012/8/1
Y1 - 2012/8/1
N2 - Although many cluster ensemble approaches exist, few of them consider how to reduce the effect of noisy attributes in the dataset. In this paper, we propose a new cluster ensemble framework, named the double self-organizing map based cluster ensemble (SOM2CE), to perform clustering on noisy datasets. SOM2CE incorporates the self-organizing map (SOM) twice into the ensemble framework to discover the underlying structure of noisy datasets, applying SOM to perform clustering not only on the sample dimension but also on the attribute dimension. SOM2CE also adopts the normalized cut algorithm to partition the consensus matrix constructed from multiple clustering solutions and obtain the final results. Experiments on both synthetic datasets and cancer gene expression profiles show that the proposed approach not only achieves good performance on both, but also outperforms most existing approaches in clustering gene expression profiles.
AB - Although many cluster ensemble approaches exist, few of them consider how to reduce the effect of noisy attributes in the dataset. In this paper, we propose a new cluster ensemble framework, named the double self-organizing map based cluster ensemble (SOM2CE), to perform clustering on noisy datasets. SOM2CE incorporates the self-organizing map (SOM) twice into the ensemble framework to discover the underlying structure of noisy datasets, applying SOM to perform clustering not only on the sample dimension but also on the attribute dimension. SOM2CE also adopts the normalized cut algorithm to partition the consensus matrix constructed from multiple clustering solutions and obtain the final results. Experiments on both synthetic datasets and cancer gene expression profiles show that the proposed approach not only achieves good performance on both, but also outperforms most existing approaches in clustering gene expression profiles.
KW - cancer data
KW - Cluster ensemble
KW - self-organizing map
UR - http://www.scopus.com/inward/record.url?scp=84864336520&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-31087-4_37
DO - 10.1007/978-3-642-31087-4_37
M3 - Conference article published in proceeding or book
SN - 9783642310867
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 351
EP - 360
BT - Advanced Research in Applied Artificial Intelligence - 25th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2012, Proceedings
T2 - 25th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2012
Y2 - 9 June 2012 through 12 June 2012
ER -