Beyond redundancies: A metric-invariant method for unsupervised feature selection

Yuexian Hou, Peng Zhang, Tingxu Yan, Wenjie Li, Dawei Song

Research output: Journal article publication › Journal article › Academic research › peer-review

6 Citations (Scopus)

Abstract

A fundamental goal of unsupervised feature selection is denoising, which aims to identify and reduce noisy features that are not discriminative. Due to the lack of information about real classes, denoising is a challenging task. The noisy features can disturb the reasonable distance metric and result in unreasonable feature spaces, i.e., the feature spaces in which common clustering algorithms cannot effectively find real classes. To overcome the problem, we make a primary observation that the relevance of features is intrinsic and independent of any metric scaling on the feature space. This observation implies that feature selection should be invariant, at least to some extent, with respect to metric scaling. In this paper, we clarify the necessity of considering the metric invariance in unsupervised feature selection and propose a novel model incorporating metric invariance. Our proposed method is motivated by the following observations: if the statistic that guides the unsupervised feature selection process is invariant with respect to possible metric scaling, the solution of this model will also be invariant. Hence, if a metric-invariant model can distinguish discriminative features from noisy ones in a reasonable feature space, it will also work on the unreasonable counterpart transformed from the reasonable one by metric scaling. A theoretical justification of the metric invariance of our proposed model is given and the empirical evaluation demonstrates its promising performance.
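The invariance argument in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes per-feature positive rescaling as the "metric scaling", and it uses negative kurtosis as a stand-in scale-invariant statistic chosen only for illustration. The function names (`variance_score`, `kurtosis_score`, `rank_features`) and the synthetic data are hypothetical.

```python
import numpy as np

def variance_score(X):
    # Scale-dependent statistic: changes when a feature is rescaled.
    return X.var(axis=0)

def kurtosis_score(X):
    # Stand-in scale-invariant statistic (an assumption, not the paper's):
    # negative standardized fourth moment, unchanged by x -> a * x for a > 0.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    return -(Z ** 4).mean(axis=0)

def rank_features(scores):
    # Feature indices ordered from highest score to lowest.
    return np.argsort(-scores)

rng = np.random.default_rng(0)
n = 1000
labels = rng.integers(0, 2, size=n)

# Feature 0 separates two hidden classes; features 1 and 2 are pure noise.
discriminative = np.where(labels == 0, -2.0, 2.0) + rng.normal(0.0, 0.5, size=n)
noise = rng.normal(0.0, 1.0, size=(n, 2))
X = np.column_stack([discriminative, noise])

# "Unreasonable" counterpart: shrink the useful feature, inflate the noise.
scaling = np.array([0.01, 10.0, 5.0])
X_scaled = X * scaling

var_rank_before = rank_features(variance_score(X))
var_rank_after = rank_features(variance_score(X_scaled))
inv_rank_before = rank_features(kurtosis_score(X))
inv_rank_after = rank_features(kurtosis_score(X_scaled))

print("variance ranking:", var_rank_before, "->", var_rank_after)
print("invariant ranking:", inv_rank_before, "->", inv_rank_after)
```

In this toy setting the variance-based ranking is overturned by the rescaling, while the ranking driven by the scale-invariant statistic still places the discriminative feature first. That is the behavior the abstract argues a metric-invariant selection criterion should have.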
Original language: English
Article number: 4815245
Pages (from-to): 348-364
Number of pages: 17
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 22
Issue number: 3
Publication status: Published - 1 Mar 2010

Keywords

  • Feature evaluation and selection
  • Information theory
  • Metric invariant

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
