Analysis of global k-means, an incremental heuristic for minimum sum-of-squares clustering

Pierre Hansen, Wai Ting Ngai, Bernard K. Cheung, Nenad Mladenovic

Research output: Journal article publicationJournal articleAcademic researchpeer-review

39 Citations (Scopus)

Abstract

The global k-means heuristic is a recently proposed (Likas, Vlassis and Verbeek, 2003) incremental approach for minimum sum-of-squares clustering of a set X of N points of Rdinto M clusters. For k = 2,3,..., M - 1 it considers the best-known set of k - 1 centroids previously obtained, adds a new cluster center at each point of X in turn and applies k-means to each set of k centroids so-obtained, keeping the best k-partition found. We show that global k-means cannot be guaranteed to find the optimum partition for any M ≥ 2 and d ≥ 1; moreover, the same holds for all M ≥ 3 if the new cluster center is chosen anywhere in Rdinstead of belonging to X. The empirical performance of global k-means is also evaluated by comparing the values it obtains with those obtained for three data sets with N ≤ 150 which are solved optimally, as well as with values obtained by the recent j-means heuristic and extensions thereof for three larger data sets with N ≤ 3038.
Original languageEnglish
Pages (from-to)287-310
Number of pages24
JournalJournal of Classification
Volume22
Issue number2
DOIs
Publication statusPublished - 1 Sep 2005

Keywords

  • Clustering
  • Global k-means
  • J-means
  • K-means
  • Minimum sum-of-squares

ASJC Scopus subject areas

  • Mathematics (miscellaneous)
  • Psychology (miscellaneous)
  • Statistics, Probability and Uncertainty
  • Library and Information Sciences

Cite this