Abstract
Continuous relaxation of hard assignment clustering problems can lead to better solutions than
greedy iterative refinement algorithms. However, the validity of existing relaxations is contingent
on problem specific fuzzy parameters that quantify the level of similarity between the original combinatorial
problem and the relaxed continuous domain problem. Equivalence of solutions obtained
from the relaxation and the hard assignment is guaranteed only in the limit of vanishing ‘fuzzyness’.
This paper derives a new exact relaxation without such a fuzzy parameter which is applicable for
a wide range of clustering problems such as the K-means objective and pairwise clustering as well
as graph partition problems, e.g., for community detection in complex networks. In particular we
show that a relaxation to the simplex can be given for which the extreme solutions are stable hard
assignment solutions and vice versa. Based on the new relaxation we derive the SR-clustering
algorithm that has the same complexity as traditional greedy iterative refinement algorithms but
leading to significantly better partitions of the data. A Matlab implementation of the SR-clustering
algorithm is available for download.
Original language | English |
---|---|
Journal | Journal of Machine Learning Research |
ISSN | 1532-4435 |
Publication status | Submitted - 2009 |
Keywords
- Simplicial Relaxation
- Complex Networks
- Clustering
- sr-clustering
- Community Detection