A novel class noise estimation method and application in classification

Lin Gui, Qin Lu, Ruifeng Xu, Minglei Li, Qikang Wei

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

4 Citations (Scopus)

Abstract

Noise in class labels of any training set can lead to poor classification results no matter what machine learning method is used. In this paper, we first present the problem of binary classification in the presence of random noise on the class labels, which we call class noise. To model class noise, a class noise rate is normally defined as a small independent probability of the class labels being inverted on the whole set of training data. In this paper, we propose a method to estimate class noise rate at the level of individual samples in real data. Based on the estimation result, we propose two approaches to handle class noise. The first technique is based on modifying a given surrogate loss function. The second technique eliminates class noise by sampling. Furthermore, we prove that the optimal hypothesis on the noisy distribution can approximate the optimal hypothesis on the clean distribution using both approaches. Our methods achieve over 87% accuracy on a synthetic non-separable dataset even when 40% of the labels are inverted. Comparisons to other algorithms show that our methods outperform state-of-the-art approaches on several benchmark datasets in different domains with different noise rates.
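The definition of class noise used in the abstract, where each binary label is inverted independently with some probability, can be illustrated with a short simulation. The following is a minimal sketch and not the authors' estimation method: the function names `flip_labels` and `observed_noise_rate`, the +1/-1 label encoding, and the fixed seed are all assumptions made for illustration. It only shows how a noisy training set with a 40% noise rate, as in the synthetic experiment mentioned above, might be generated.

```python
import random


def flip_labels(labels, noise_rate, seed=0):
    """Invert each binary label (+1/-1) independently with probability noise_rate.

    Hypothetical helper: simulates uniform random class noise only.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)
    return [-y if rng.random() < noise_rate else y for y in labels]


def observed_noise_rate(clean, noisy):
    """Fraction of positions where the noisy label differs from the clean one."""
    if len(clean) != len(noisy):
        raise ValueError("label lists must have the same length")
    return sum(c != n for c, n in zip(clean, noisy)) / len(clean)


# Balanced toy label set (assumed +1/-1 encoding).
clean = [1 if i % 2 == 0 else -1 for i in range(10_000)]
noisy = flip_labels(clean, noise_rate=0.4)
print(f"observed noise rate: {observed_noise_rate(clean, noisy):.3f}")
```

In practice the clean labels are not available, which is why the noise rate has to be estimated from the noisy data; this sketch only covers the data-generating side of the problem.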
Original language: English
Title of host publication: CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 1081-1090
Number of pages: 10
Volume: 19-23-Oct-2015
ISBN (Electronic): 9781450337946
Publication status: Published - 17 Oct 2015
Event: 24th ACM International Conference on Information and Knowledge Management, CIKM 2015 - Melbourne, Australia
Duration: 19 Oct 2015 to 23 Oct 2015

Conference

Conference: 24th ACM International Conference on Information and Knowledge Management, CIKM 2015
Country/Territory: Australia
City: Melbourne
Period: 19/10/15 to 23/10/15

Keywords

  • Class noise
  • Learning with noise
  • Noise elimination

ASJC Scopus subject areas

  • Decision Sciences(all)
  • Business, Management and Accounting(all)
