BPSO-Adaboost-KNN ensemble learning algorithm for multi-class imbalanced data classification

Guo Haixiang, Li Yijing, Li Yanan, Liu Xiao, Li Jinling

Research output: Journal article publicationJournal articleAcademic researchpeer-review

117 Citations (Scopus)

Abstract

This paper proposes an ensemble algorithm named of BPSO-Adaboost-KNN to cope with multi-class imbalanced data classification. The main idea of this algorithm is to integrate feature selection and boosting into ensemble. What's more, we utilize a novel evaluation metric called AUCarea which is especially for multi-class classification. In our model BPSO is employed as the feature selection algorithm in which AUCarea is chosen as the fitness. For classification, we generate a boosting classifier in which KNN is selected as the basic classifier. In order to verify the effectiveness of our method, 19 benchmarks are used in our experiments. The results show that the proposed algorithm improves both the stability and the accuracy of boosting after carrying out feature selection, and the performance of our algorithm is comparable with other state-of-the-art algorithms. In statistical analyses, we apply Bland-Altman analysis to show the consistencies between AUCarea and other popular metrics like average G-mean, average F-value etc. Besides, we use linear regression to find deeper correlation between AUCarea and other metrics in order to show why AUCarea works well in this issue. We also put out a series of statistical studies in order to analyze if there exist significant improvements after feature selection and boosting are employed. At last, the proposed algorithm is applied in oil-bearing of reservoir recognition. The classification precision is up to 99% in oilsk81-oilsk85 well logging data in Jianghan oilfield of China, which is 20% higher than KNN classifier. Particularly, the proposed algorithm has significant superiority when distinguishing the oil layer from other layers.

Original languageEnglish
Pages (from-to)176-193
Number of pages18
JournalEngineering Applications of Artificial Intelligence
Volume49
DOIs
Publication statusPublished - Mar 2016
Externally publishedYes

Keywords

  • Classification
  • Ensemble
  • Feature selection
  • Imbalanced data
  • Oil reservoir

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering

Cite this