A boosting based ensemble learning algorithm in imbalanced data classification

Yijing Li, Haixiang Guo, Yanan Li, Xiao Liu

Research output: Journal article publication › Journal article › Academic research › peer-review

17 Citations (Scopus)

Abstract

This paper addresses multi-class imbalanced data classification and proposes BPSO-Adaboost-KNN, an ensemble learning algorithm that combines feature selection with boosting. The algorithm uses a visual AUCarea metric to evaluate classifier performance on multi-class problems. The proposed algorithm is tested on 10 UCI and KEEL data sets. The results show that extracting the key features improves the stability of Adaboost, and that classification accuracy on the ten data sets is 20% to 40% higher than that of the KNN classifier. Compared with three other state-of-the-art ensemble algorithms, BPSO-Adaboost-KNN obtains equal or better results. Finally, the proposed algorithm is applied to oil-bearing reservoir recognition, where it successfully selects three key attributes (acoustic wave, porosity and oil saturation). Classification precision exceeds 98% on the oilsk81 to oilsk85 Jianghan well logging data, which is 20% higher than the KNN classifier. In particular, the proposed algorithm has significant superiority when distinguishing the oil layer from other oil layers.
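The abstract does not give implementation details, so the following is only a minimal sketch of the feature-selection stage, not the authors' method. It assumes a standard binary particle swarm optimisation (BPSO) with a sigmoid transfer rule, and it assumes leave-one-out KNN accuracy as the fitness of a feature mask. The Adaboost stage and the AUCarea metric are omitted. All function names, parameter values and the toy data are hypothetical.

```python
import math
import random


def loo_knn_accuracy(X, y, mask, k=1):
    """Leave-one-out k-NN accuracy using only the features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:  # an empty feature subset cannot classify anything
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the selected columns.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = {}
        for _, label in dists[:k]:
            votes[label] = votes.get(label, 0) + 1
        if max(votes, key=votes.get) == y[i]:
            correct += 1
    return correct / len(X)


def bpso_select(X, y, n_particles=8, n_iter=15, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO over feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [loo_knn_accuracy(X, y, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                # Velocity update pulled toward personal and global bests.
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                # Clamp (assumed bound) so the sigmoid never saturates fully.
                vel[i][j] = max(-6.0, min(6.0, vel[i][j]))
                # Sigmoid transfer: velocity becomes the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = loo_knn_accuracy(X, y, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def make_toy_data(seed=1):
    """Hypothetical imbalanced data: feature 0 is informative, 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((0, 20), (1, 6)):
        for _ in range(n):
            X.append([label * 5.0 + rng.gauss(0, 0.3),
                      rng.uniform(-10, 10),
                      rng.uniform(-10, 10)])
            y.append(label)
    return X, y


X, y = make_toy_data()
mask, fit = bpso_select(X, y)
print("selected feature mask:", mask, "fitness:", fit)
```

Plain accuracy is used here only to keep the sketch short. On imbalanced data an AUC-based fitness would be a more natural choice, and in a full pipeline the selected columns would then be passed to a boosted KNN ensemble.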

Original language: English
Pages (from-to): 189-199
Number of pages: 11
Journal: Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice
Volume: 36
Issue number: 1
Publication status: Published - 25 Jan 2016
Externally published: Yes

Keywords

  • Classification
  • Feature selection
  • Imbalanced data
  • Oil reservoir

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Modelling and Simulation
  • Economic Geology
  • Computer Science Applications
