Rare category exploration

H. Huang, K. Chiew, Y. Gao, Q. He, Qing Li

Research output: Journal article publication, Journal article, Academic research, peer-review

12 Citations (Scopus)

Abstract

Rare category discovery aims to identify unlabeled data examples of rare categories in a given data set. Existing approaches to rare category discovery often need a certain number of labeled data examples as a training set, which are usually difficult and expensive to acquire in practice. If these methods use only a small training set to reduce cost, however, their accuracy may not be satisfactory for real applications. In this paper, we propose, for the first time, the concept of rare category exploration, which aims to discover all data examples of a rare category from a seed (a labeled data example of that rare category) instead of from a training set. To this end, we present the FRANK algorithm, which transforms rare category exploration into local community detection from a seed in a kNN (k-nearest neighbors) graph with an automatically selected k value. Extensive experimental results on real data sets verify the effectiveness and efficiency of the FRANK algorithm. © 2014 Elsevier Ltd. All rights reserved.
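
The general idea of local community detection from a seed in a kNN graph can be illustrated with a small sketch. This is not the FRANK algorithm: the abstract does not specify its expansion rule or how k is selected automatically, so the sketch assumes a fixed k and a simple greedy rule that grows the community while the share of edges staying inside it increases. The names `knn_graph`, `density`, and `local_community`, and the toy data, are hypothetical.

```python
import math


def knn_graph(points, k):
    """Undirected kNN graph: i and j are linked if either is among the other's k nearest."""
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def density(adj, members):
    """Fraction of edges touching `members` that stay inside it (assumed score)."""
    internal = sum(1 for u in members for v in adj[u] if v in members) / 2
    boundary = sum(1 for u in members for v in adj[u] if v not in members)
    total = internal + boundary
    return internal / total if total else 0.0


def local_community(adj, seed):
    """Greedily add the neighbor that most improves the score; stop when none does."""
    community = {seed}
    score = density(adj, community)
    while True:
        frontier = {v for u in community for v in adj[u]} - community
        best, best_score = None, score
        for v in sorted(frontier):
            s = density(adj, community | {v})
            if s > best_score:
                best, best_score = v, s
        if best is None:
            return community
        community.add(best)
        score = best_score


# Toy data: a tight "rare" cluster (indices 0-3) and a looser background group.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0), (7.0, 5.0), (5.0, 7.0)]
graph = knn_graph(points, k=3)  # k is fixed by hand here, not selected automatically
print(sorted(local_community(graph, seed=0)))
```

In this toy case the seed at index 0 sits in a cluster that is well separated from the background, so the expansion stops once the four cluster points are collected. Real data would rarely be this clean, which is where the choice of k matters.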
Original language: English
Pages (from-to): 4197-4210
Number of pages: 14
Journal: Expert Systems with Applications
Volume: 41
Issue number: 9
Publication status: Published - 1 Jul 2014
Externally published: Yes

Keywords

  • Histogram density estimation
  • kNN graph
  • Local community
  • Rare category exploration

ASJC Scopus subject areas

  • Engineering(all)
  • Computer Science Applications
  • Artificial Intelligence
