Abstract
We propose a new optimal clustering effectiveness measure, called CS1, that is based on a combination of clusters rather than on a single optimal cluster, as in the traditional MK1 measure. For hierarchical clustering, we present an algorithm to compute CS1, defined by seeking the optimal combination of disjoint clusters obtained by cutting the hierarchical structure at a certain similarity level. By reformulating the optimization as a 0-1 linear fractional programming problem, we demonstrate that an exact solution can be obtained by a linear-time algorithm. We further discuss how our approach can be extended to more general problems involving overlapping clusters, and we show how optimal estimates can be obtained by greedy algorithms.
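To make the 0-1 linear fractional formulation concrete, the following is a minimal sketch under stated assumptions, not the algorithm from the paper. It assumes the effectiveness of a set of disjoint clusters is the F-measure, (1 + β²)·rel / (β²·R + size), where rel and size are summed over the chosen clusters and R is the total number of relevant documents; the abstract does not define CS1, so this objective is an assumption. It also uses Dinkelbach's iteration for fractional programs rather than the linear-time method the abstract mentions. The function name `best_cluster_combination` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def best_cluster_combination(
    clusters: Sequence[Tuple[int, int]],
    total_relevant: int,
    beta: float = 1.0,
) -> Tuple[float, List[int]]:
    """Pick the subset of disjoint clusters that maximizes the F-measure.

    clusters: (relevant_docs_in_cluster, cluster_size) pairs (assumed input).
    Returns (best_score, indices_of_chosen_clusters).
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    b2 = beta * beta
    lam = 0.0              # current best ratio found so far
    chosen: List[int] = []  # subset that achieves lam
    while True:
        # For a fixed lam, the parametric problem is linear in the 0-1
        # variables, so it is solved by keeping every cluster whose
        # term (1 + b2) * r - lam * n is strictly positive.
        pick = [
            i for i, (r, n) in enumerate(clusters)
            if (1 + b2) * r - lam * n > 0
        ]
        rel = sum(clusters[i][0] for i in pick)
        size = sum(clusters[i][1] for i in pick)
        new_lam = (1 + b2) * rel / (b2 * total_relevant + size)
        # Dinkelbach stopping rule: no further improvement in the ratio.
        if new_lam <= lam + 1e-12:
            return lam, chosen
        lam, chosen = new_lam, pick


# Illustrative data only: four disjoint clusters, six relevant documents.
score, chosen = best_cluster_combination(
    [(3, 4), (1, 5), (2, 2), (0, 3)], total_relevant=6
)
```

On this made-up input the best single cluster scores 0.6, while combining the first and third clusters scores 10/12, which illustrates why a combination-based measure can exceed a single-cluster one.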
Original language | English |
---|---|
Pages (from-to) | 390-406 |
Number of pages | 17 |
Journal | Journal of the American Society for Information Science and Technology |
Volume | 59 |
Issue number | 3 |
Publication status | Published - 1 Feb 2008 |
ASJC Scopus subject areas
- Software
- Information Systems
- Human-Computer Interaction
- Computer Networks and Communications
- Artificial Intelligence