TY - JOUR
T1 - Adapted ensemble classification algorithm based on multiple classifier system and feature selection for classifying multi-class imbalanced data
AU - Yijing, Li
AU - Haixiang, Guo
AU - Xiao, Liu
AU - Yanan, Li
AU - Jinling, Li
N1 - Funding Information:
This research has been supported by National Natural Science Foundation of China under Grant nos. 71103163, 71573237; New Century Excellent Talents in University of China under Grant no. NCET-13-1012; Research Foundation of Humanities and Social Sciences of Ministry of Education of China under Grant no. 15YJA630019; Special Funding for Basic Scientific Research of Chinese Central University under Grant nos. CUG120111, CUG110411, G2012002A, CUG140604; Open Foundation for the Research Center of Resource Environment Economics in China University of Geosciences (Wuhan) (Grant no. H2015004B); Structure and Oil Resources Key Laboratory Open Project of China under Grant no. TPR-2011-11.
Publisher Copyright:
© 2015 Elsevier B.V. All rights reserved.
PY - 2016/2/15
Y1 - 2016/2/15
N2 - Learning from imbalanced data, where observations of one class are far rarer than those of other classes, has gained considerable attention in the data mining community. Most existing literature focuses on the binary imbalanced case, while multi-class imbalanced learning is barely mentioned. Moreover, most proposed algorithms treat all imbalanced data alike and aim to handle them with a single general-purpose algorithm. In fact, imbalanced datasets vary in imbalance ratio, dimensionality and number of classes, and classifier performance differs across these types of datasets. In this paper we propose an adaptive multiple classifier system, named AMCS, to cope with multi-class imbalanced learning; it distinguishes among different kinds of imbalanced data. AMCS comprises three components: feature selection, resampling and ensemble learning. Each component is selected specifically for the type of imbalanced data at hand. We consider two feature selection methods, three resampling mechanisms, five base classifiers and five ensemble rules to construct a selection pool, and the criterion for choosing each component from the pool to build AMCS is analyzed through an empirical study. To verify the effectiveness of AMCS, we compare it with several state-of-the-art algorithms; the results show that AMCS outperforms or is comparable with the others. Finally, AMCS is applied to oil-bearing reservoir recognition. The results indicate that AMCS makes no mistakes in recognizing the characteristics of layers for the oilsk81-oilsk85 well logging data, which were collected in the Jianghan oilfield of China.
KW - Adaptive learning
KW - Imbalanced data
KW - Multiple classifier system
KW - Oil reservoir
UR - http://www.scopus.com/inward/record.url?scp=84953638515&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2015.11.013
DO - 10.1016/j.knosys.2015.11.013
M3 - Journal article
AN - SCOPUS:84953638515
SN - 0950-7051
VL - 94
SP - 88
EP - 104
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
ER -