An effective evolutionary algorithm for discrete-valued data clustering

Patrick C.H. Ma, Chun Chung Chan, Xin Yao

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

Abstract

Clustering is concerned with the discovery of interesting groupings of records in a database. Of the many algorithms that have been developed to tackle clustering problems in a variety of application domains, much of the effort has gone into effective algorithms for handling spatial data. These algorithms were originally developed to handle continuous-valued attributes, and distance functions such as the Euclidean distance are often used to measure the pairwise similarity or distance between records and so determine their cluster memberships. Since such distance functions cannot be validly defined in non-Euclidean space, these algorithms cannot be used to handle databases that contain discrete-valued data. Because data in real-life databases are always described by a set of descriptive attributes, many of which are not numerical or inherently ordered in any way, it is important to develop a clustering algorithm that can handle data mining tasks involving them. In this paper, we propose an effective evolutionary clustering algorithm for this problem. For performance evaluation, we have tested the proposed algorithm on several real data sets. Experimental results show that it outperforms the existing algorithms commonly used for discrete-valued data clustering, and that its performance on mixed continuous- and discrete-valued data is also promising.
Original language: English
Title of host publication: 2008 IEEE Congress on Evolutionary Computation, CEC 2008
Pages: 210-216
Number of pages: 7
Publication status: Published - 14 Nov 2008
Event: 2008 IEEE Congress on Evolutionary Computation, CEC 2008 - Hong Kong, Hong Kong
Duration: 1 Jun 2008 - 6 Jun 2008

Conference

Conference: 2008 IEEE Congress on Evolutionary Computation, CEC 2008
Country/Territory: Hong Kong
City: Hong Kong
Period: 1/06/08 - 6/06/08

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Theoretical Computer Science
