An unsupervised attribute clustering algorithm for unsupervised feature selection

Pei Yuan Zhou, Chun Chung Chan

Research output: Chapter in book / Conference proceeding > Conference article published in proceeding or book > Academic research > peer-review

4 Citations (Scopus)

Abstract

The curse of dimensionality refers to the problems that arise when analyzing datasets with thousands or hundreds of thousands of attributes. It is usually tackled with feature selection methods, which have been shown to reduce computation time, improve prediction performance, and facilitate better understanding of datasets in various application areas. These methods can be classified into filter methods, wrapper methods and embedded methods. All of these feature selection methods require class label information to perform their tasks. Hence, when such information is unavailable, the feature selection problem can be very challenging. To overcome this challenge, we propose an unsupervised feature selection method, called the Unsupervised Attribute Clustering Algorithm (UACA), which consists of two steps: i) calculate the Maximal Information Coefficient for each pair of attributes to construct an attribute distance matrix; ii) cluster all attributes using an optimal k-mode clustering method and take the k mode attributes, one per cluster, as the selected features. To evaluate the proposed algorithm, we tested it on classification problems with different classifiers and compared it with other methods. The experimental results show that the proposed unsupervised algorithm is comparable with classical feature selection methods and even outperforms some supervised learning algorithms.
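The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: normalized mutual information over already-discretized attributes stands in for the Maximal Information Coefficient, a plain k-medoids loop stands in for the "optimal k-mode" clustering, and k is supplied by the caller rather than chosen automatically. The function names normalized_mutual_information and select_mode_attributes are hypothetical.

```python
import math
import random
from collections import Counter


def normalized_mutual_information(x, y):
    """Dependence score in [0, 1] for two discrete sequences.

    Assumption: used here as a simple stand-in for MIC, which the
    abstract names but which is not available in the standard library.
    """
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    hx = -sum(c / n * math.log(c / n) for c in px.values())
    hy = -sum(c / n * math.log(c / n) for c in py.values())
    mi = sum(c / n * math.log(c * n / (px[a] * py[b]))
             for (a, b), c in pxy.items())
    denom = max(hx, hy)
    if denom == 0:
        return 0.0  # both attributes are constant
    return min(1.0, max(0.0, mi / denom))


def select_mode_attributes(columns, k, seed=0, max_iter=100):
    """Cluster attributes and return the index of one mode per cluster.

    columns: list of attributes, each a list of discrete values.
    Assumption: k-medoids on the distance 1 - NMI, with k given.
    """
    m = len(columns)
    if not 1 <= k <= m:
        raise ValueError("k must be between 1 and the number of attributes")

    # Step i: pairwise attribute distance matrix.
    dist = [[0.0 if i == j else
             1.0 - normalized_mutual_information(columns[i], columns[j])
             for j in range(m)] for i in range(m)]

    # Step ii: alternate between assigning attributes to the nearest
    # mode and re-picking each cluster's most central attribute.
    modes = random.Random(seed).sample(range(m), k)
    for _ in range(max_iter):
        clusters = {c: [] for c in modes}
        for j in range(m):
            if j in clusters:
                clusters[j].append(j)  # a mode always stays in its own cluster
            else:
                nearest = min(modes, key=lambda c: dist[j][c])
                clusters[nearest].append(j)
        new_modes = [min(members, key=lambda c: sum(dist[c][j] for j in members))
                     for members in clusters.values()]
        if sorted(new_modes) == sorted(modes):
            break
        modes = new_modes
    return sorted(modes)
```

Calling select_mode_attributes(columns, k) returns k attribute indices, one per cluster, which would then be the retained features. Redundant attributes (for example, one that is a relabeling of another) have distance near 0 and tend to fall into the same cluster, so only one of them is kept.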
Original language: English
Title of host publication: Proceedings of the 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015
Publisher: IEEE
ISBN (Electronic): 9781467382731
DOIs
Publication status: Published - 2 Dec 2015
Event: IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015 - Paris, France
Duration: 19 Oct 2015 → 21 Oct 2015

Conference

Conference: IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015
Country/Territory: France
City: Paris
Period: 19/10/15 → 21/10/15

Keywords

  • mode
  • unsupervised attribute clustering
  • unsupervised feature selection

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems and Management
  • Information Systems
