TY - JOUR
T1 - A cluster-based intelligence ensemble learning method for classification problems
AU - Cui, Shaoze
AU - Wang, Yanzhang
AU - Yin, Yunqiang
AU - Cheng, T. C. E.
AU - Wang, Dujuan
AU - Zhai, Mingyu
N1 - Funding Information:
We thank the Editor, Associate Editor, and three anonymous referees for their helpful comments on earlier versions of our paper. This study was supported by the National Natural Science Foundation of China under grant numbers 71533001, 71974025, 71971041, and 71871148; and by the Outstanding Young Scientific and Technological Talents Foundation of Sichuan Province under grant number 2020JDJQ0035. Cheng was supported in part by The Hong Kong Polytechnic University under the Fung Yiu King-Wing Hang Bank Endowed Professorship in Business Administration. As this study was completed during the global COVID-19 epidemic, we thank all the health workers for their efforts in taking care of the patients and hope that the epidemic will end soon.
Publisher Copyright:
© 2021 Elsevier Inc.
PY - 2021/6
Y1 - 2021/6
AB - Classification is a vital task in machine learning. By learning patterns from samples with known categories, a model develops the ability to predict the categories of unlabeled samples. Noting the strengths of clustering in analyzing cluster structure, we combine clustering and classification methods to develop a novel cluster-based intelligence ensemble learning (CIEL) method. We first use a clustering method to analyze the inherent distribution of the data and divide the samples into clusters according to the characteristics of the dataset. Then, for each cluster, we use different classification algorithms to build the corresponding classification models. Finally, we integrate the predictions of the base classifiers to form the final prediction. To address parameter sensitivity, we use a swarm intelligence algorithm to optimize the key parameters of the clustering, classification, and ensemble stages in order to boost classification performance. To assess the effectiveness of CIEL, we perform tenfold cross-validation experiments on 24 benchmark datasets from UCI and KEEL. Designed to improve classifier performance, CIEL outperforms other popular machine learning methods such as naive Bayes, k-nearest neighbors, random forest, and support vector machine.
KW - Classification algorithm
KW - Clustering algorithm
KW - Combination strategy
KW - Ensemble learning
KW - Swarm intelligence algorithm
UR - http://www.scopus.com/inward/record.url?scp=85101383389&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2021.01.061
DO - 10.1016/j.ins.2021.01.061
M3 - Journal article
AN - SCOPUS:85101383389
SN - 0020-0255
VL - 560
SP - 386
EP - 409
JO - Information Sciences
JF - Information Sciences
ER -