Abstract
Given a set of clusters, we consider an optimization problem which seeks a subset of clusters that maximizes the microaverage F-measure. This optimal value can be used as an evaluation measure of the goodness of clustering. For arbitrarily overlapping clusters, finding the optimal value is NP-hard. We claim that a greedy approximation algorithm yields the global optimal solution for clusters that overlap only by nesting. We present a mathematical proof of this claim by induction. For a family of n clusters containing a total of N objects, this algorithm has an O(n2}) time complexity and O(N) space complexity.
Original language | English |
---|---|
Pages (from-to) | 2083-2087 |
Number of pages | 5 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Volume | 31 |
Issue number | 11 |
DOIs | |
Publication status | Published - 21 Aug 2009 |
Keywords
- Classification
- Clustering
- Optimization
- Performance evaluation
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition
- Computational Theory and Mathematics
- Artificial Intelligence
- Applied Mathematics