Building a decision cluster classification model for high dimensional data by a variable weighting k-means method

Yan Li, Edward Hung, Fu Lai Korris Chung, Joshua Huang

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

14 Citations (Scopus)

Abstract

This paper proposes a new classification method (ADCC) for high dimensional data. In this method, a decision cluster classification (DCC) model consists of a set of disjoint decision clusters, each labeled with a dominant class that determines the class of new objects falling in the cluster. A cluster tree is first generated from a training data set by recursively calling a variable weighting k-means algorithm, and the DCC model is then selected from the tree. The Anderson-Darling test is used to determine the stopping condition for growing the tree. A series of experiments on both synthetic and real data sets showed that ADCC performed better in accuracy and scalability than the existing k-NN, decision tree and SVM methods. It is particularly suitable for large, high dimensional data with many classes.
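The recursive structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain unweighted k-means stands in for the variable weighting k-means (W-k-means) algorithm, a simple size and purity threshold stands in for the Anderson-Darling stopping test, and every leaf of the tree is kept as a decision cluster rather than selecting a subset. All function names and parameters (`grow`, `classify`, `min_size`, `min_purity`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, rng, iters=20):
    """Plain k-means (stand-in for W-k-means). Returns non-empty index groups."""
    centres = rng.sample(points, k)
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for i, p in enumerate(points):
            nearest = min(range(k), key=lambda c: sq_dist(p, centres[c]))
            groups[nearest].append(i)
        new_centres = [
            centroid([points[i] for i in g]) if g else centres[j]
            for j, g in enumerate(groups)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return [g for g in groups if g]


def grow(points, labels, rng, k=2, min_size=4, min_purity=0.95):
    """Recursively split the data; return leaves as (centre, dominant_class)."""
    dominant, count = Counter(labels).most_common(1)[0]
    # Assumed stopping rule: the cluster is small or nearly pure.
    # The paper uses an Anderson-Darling test here instead.
    if len(points) < min_size or count / len(points) >= min_purity:
        return [(centroid(points), dominant)]
    groups = kmeans(points, k, rng)
    if len(groups) < 2:  # no real split was found, so stop here
        return [(centroid(points), dominant)]
    leaves = []
    for g in groups:
        leaves += grow([points[i] for i in g], [labels[i] for i in g],
                       rng, k, min_size, min_purity)
    return leaves


def classify(leaves, x):
    """Assign x the dominant class of the decision cluster with the nearest centre."""
    _, label = min(leaves, key=lambda leaf: sq_dist(x, leaf[0]))
    return label


rng = random.Random(0)
train = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
         (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.3, 5.0)]
labels = ["a"] * 4 + ["b"] * 4
model = grow(train, labels, rng)
print(classify(model, (0.1, 0.1)), classify(model, (5.1, 5.0)))
```

Here a new object is assigned to the decision cluster whose centre is nearest, which is one plausible reading of "falling in the cluster"; the paper may define membership differently.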
Original language: English
Title of host publication: AI 2008
Subtitle of host publication: Advances in Artificial Intelligence - 21st Australasian Joint Conference on Artificial Intelligence, Proceedings
Pages: 337-347
Number of pages: 11
Publication status: Published - 1 Dec 2008
Event: 21st Australasian Joint Conference on Artificial Intelligence, AI 2008 - Auckland, New Zealand
Duration: 1 Dec 2008 to 5 Dec 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5360 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st Australasian Joint Conference on Artificial Intelligence, AI 2008
Country: New Zealand
City: Auckland
Period: 1/12/08 to 5/12/08

Keywords

  • Classification
  • Clustering
  • K-NN
  • W-k-means

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
