Abstract
As a fundamental step in many data analysis tasks, exemplar-based clustering groups data by identifying representative samples that serve as exemplars of the resulting clusters. In this paper, a new fast exemplar-based clustering approach is proposed for datasets whose clusters may be arbitrary in shape and number. The proposed approach begins with the reduced set of a dataset, which is a condensation of the dataset obtained by one of two well-developed kernel density estimators, the reduced set density estimator or the fast reduced set density estimator, and then proceeds in two stages: 1) fast exemplar finding (FEF) and 2) fast cluster assignment. The approach rests on three assumptions: 1) exemplars should come from high-density samples; 2) exemplars should be either components of the reduced set or their highly similar neighbors; and 3) clusters can be formed by diffusion around both the exemplars and the labeled reduced set. We theoretically analyze the proposed FEF from the perspective of the generalization performance of clustering and demonstrate the power of the proposed approach on several benchmark datasets.
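The two-stage idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract gives no formulas, so the sketch swaps in a plain Gaussian kernel density estimate for the reduced set density estimator, takes the reduced set as a given list of indices, and uses a simple "no denser reduced point within a radius" rule to pick exemplars. The names `gaussian_density`, `sketch_cluster`, `bandwidth`, and `radius` are all hypothetical.

```python
import numpy as np

def gaussian_density(points, queries, bandwidth):
    # Unnormalized Gaussian kernel density of `queries` under `points`.
    # Stand-in only: the paper uses RSDE/FRSDE, which is not reproduced here.
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * bandwidth ** 2)).mean(axis=1)

def sketch_cluster(data, reduced_idx, bandwidth, radius):
    # `reduced_idx` is assumed to be supplied (hypothetical input).
    reduced = data[reduced_idx]
    dens = gaussian_density(data, reduced, bandwidth)
    d = np.sqrt(((reduced[:, None, :] - reduced[None, :, :]) ** 2).sum(axis=2))

    # Stage 1 (exemplar finding, assumed rule): visit reduced points from
    # densest to sparsest; a point with no denser reduced point within
    # `radius` becomes an exemplar, otherwise it inherits the label of its
    # nearest denser neighbor.
    labels_r = np.full(len(reduced), -1)
    exemplars, processed = [], []
    for i in np.argsort(-dens):
        near = [j for j in processed if d[i, j] <= radius]
        if near:
            nearest = min(near, key=lambda j: d[i, j])
            labels_r[i] = labels_r[nearest]
        else:
            labels_r[i] = len(exemplars)
            exemplars.append(reduced_idx[i])
        processed.append(i)

    # Stage 2 (cluster assignment, assumed rule): every sample takes the
    # label of its nearest labeled reduced-set point.
    d_all = np.sqrt(((data[:, None, :] - reduced[None, :, :]) ** 2).sum(axis=2))
    labels = labels_r[d_all.argmin(axis=1)]
    return np.array(exemplars), labels

# Toy usage: two well-separated blobs, every fourth sample as the reduced set.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
reduced_idx = np.arange(0, 40, 4)
exemplars, labels = sketch_cluster(data, reduced_idx, bandwidth=0.5, radius=1.0)
print(len(exemplars), labels)
```

Working on the reduced set keeps the pairwise distance matrix in the first stage small, which is the intuition behind starting from a condensed dataset; the actual speed and accuracy of the published method depend on details not stated in the abstract.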
Original language | English |
---|---|
Journal | IEEE Transactions on Systems, Man, and Cybernetics: Systems |
Publication status | Accepted/In press - 2 May 2017 |
ASJC Scopus subject areas
- Software
- Control and Systems Engineering
- Human-Computer Interaction
- Computer Science Applications
- Electrical and Electronic Engineering