Cutoff for exact recovery of Gaussian mixture models
This solves a fundamental problem in clustering theory for statisticians and machine learning researchers, providing exact recovery guarantees.
The paper determines the information-theoretic cutoff value for exact recovery of cluster labels in Gaussian mixture models with equal cluster sizes, and shows that a semidefinite programming relaxation of K-means achieves this sharp threshold without symmetry assumptions.
We determine the information-theoretic cutoff value on separation of cluster centers for exact recovery of cluster labels in a $K$-component Gaussian mixture model with equal cluster sizes. Moreover, we show that a semidefinite programming (SDP) relaxation of the $K$-means clustering method achieves such sharp threshold for exact recovery without assuming the symmetry of cluster centers.