LGMLFeb 19, 2019

Recovery of a mixture of Gaussians by sum-of-norms clustering

arXiv:1902.07137v119 citations
Originality Synthesis-oriented
AI Analysis

This provides a theoretical guarantee for clustering algorithms in machine learning, though it appears incremental as it extends prior work by removing a sample size limitation.

The paper tackles the problem of recovering a mixture of Gaussians using sum-of-norms clustering, showing that this method can achieve recovery even as the number of samples tends to infinity, lifting a previous restriction.

Sum-of-norms clustering is a method for assigning $n$ points in $\mathbb{R}^d$ to $K$ clusters, $1\le K\le n$, using convex optimization. Recently, Panahi et al.\ proved that sum-of-norms clustering is guaranteed to recover a mixture of Gaussians under the restriction that the number of samples is not too large. The purpose of this note is to lift this restriction, i.e., show that sum-of-norms clustering with equal weights can recover a mixture of Gaussians even as the number of samples tends to infinity. Our proof relies on an interesting characterization of clusters computed by sum-of-norms clustering that was developed inside a proof of the agglomeration conjecture by Chiquet et al. Because we believe this theorem has independent interest, we restate and reprove the Chiquet et al.\ result herein.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes