ML · DS · IT · LG · ST · Feb 22, 2016

Clustering subgaussian mixtures by semidefinite programming

arXiv:1602.06612v2 · 103 citations
Originality: Incremental advance
AI Analysis

This work addresses k-means clustering, but it appears incremental: it builds on the existing semidefinite relaxation due to Peng and Wei and analyzes one specific setting, subgaussian mixture models.

The authors address the clustering of subgaussian mixtures with a model-free relax-and-round algorithm based on semidefinite programming. They prove performance guarantees for the algorithm and compare its approximation guarantee with the theoretically optimal k-means solution.

We introduce a model-free relax-and-round algorithm for k-means clustering based on a semidefinite relaxation due to Peng and Wei. The algorithm interprets the SDP output as a denoised version of the original data and then rounds this output to a hard clustering. We provide a generic method for proving performance guarantees for this algorithm, and we analyze the algorithm in the context of subgaussian mixture models. We also study the fundamental limits of estimating Gaussian centers by k-means clustering in order to compare our approximation guarantee to the theoretically optimal k-means clustering solution.
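
To make the relax-and-round idea concrete, here is a minimal sketch in plain Python. It assumes the Peng and Wei relaxation in its usual form: minimize Tr(DX) subject to Tr(X) = k, X1 = 1, X ≥ 0 entrywise, and X positive semidefinite, where D holds the squared pairwise distances. The sketch does not solve the SDP. Instead it builds the integral feasible point that corresponds to a known clustering and shows that (a) half its objective equals the k-means cost and (b) multiplying the data by X replaces each point with its cluster centroid, which is the "denoising" view. All function names are hypothetical, and the rounding rule (grouping denoised points that coincide) is a simplification for illustration, not the rounding scheme analyzed in the paper.

```python
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def integral_solution(labels: List[int], k: int) -> List[List[float]]:
    """Feasible X for a hard clustering: X[i][j] = 1/|A_t| if i and j share cluster t, else 0."""
    n = len(labels)
    sizes = [labels.count(t) for t in range(k)]
    return [[1.0 / sizes[labels[i]] if labels[i] == labels[j] else 0.0
             for j in range(n)] for i in range(n)]


def sdp_objective(points: List[Point], X: List[List[float]]) -> float:
    """Tr(DX), with D[i][j] the squared distance between points i and j."""
    n = len(points)
    return sum(sq_dist(points[i], points[j]) * X[i][j]
               for i in range(n) for j in range(n))


def kmeans_cost(points: List[Point], labels: List[int], k: int) -> float:
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for t in range(k):
        members = [p for p, lab in zip(points, labels) if lab == t]
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        cost += sum(sq_dist(p, centroid) for p in members)
    return cost


def denoise(points: List[Point], X: List[List[float]]) -> List[List[float]]:
    """Denoised point i is the X-weighted average sum_j X[j][i] * x_j."""
    n, dim = len(points), len(points[0])
    return [[sum(X[j][i] * points[j][d] for j in range(n)) for d in range(dim)]
            for i in range(n)]


def round_denoised(denoised: List[List[float]], tol: float = 1e-9) -> List[int]:
    """Hypothetical rounding: give the same label to denoised points within tol of each other."""
    reps: List[List[float]] = []
    labels: List[int] = []
    for p in denoised:
        for t, r in enumerate(reps):
            if sq_dist(p, r) <= tol:
                labels.append(t)
                break
        else:
            reps.append(p)
            labels.append(len(reps) - 1)
    return labels


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    X = integral_solution([0, 0, 1, 1], k=2)
    print(0.5 * sdp_objective(pts, X), kmeans_cost(pts, [0, 0, 1, 1], 2))
    print(round_denoised(denoise(pts, X)))
```

In practice X would come from an SDP solver and would generally not be exactly integral, so the denoised points would only concentrate near the centroids and a more robust rounding step would be needed.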
