DS, CG, LG · Jan 29, 2017

On the Local Structure of Stable Clustering Instances

arXiv:1701.08423v3 · 52 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of understanding why local search heuristics succeed in clustering applications, providing theoretical insights for researchers in algorithms and machine learning, though it is incremental as it builds on existing notions of structured data.

The paper analyzes the k-median and k-means clustering objectives under structured-data conditions and proves that, for such inputs, local optima are close to global optima. This yields strong performance guarantees for the Local Search algorithm, both in recovering the optimal clustering and in achieving small cost.

We study the classic $k$-median and $k$-means clustering objectives in the beyond-worst-case scenario. We consider three well-studied notions of structured data that aim at characterizing real-world inputs: Distribution Stability (introduced by Awasthi, Blum, and Sheffet, FOCS 2010), Spectral Separability (introduced by Kumar and Kannan, FOCS 2010), and Perturbation Resilience (introduced by Bilu and Linial, ICS 2010). We prove structural results showing that inputs satisfying at least one of the conditions are inherently "local". Namely, for any such input, any local optimum is close, both in terms of structure and in terms of objective value, to the global optimum. As a corollary, we obtain that the widely used Local Search algorithm has strong performance guarantees for both the task of recovering the underlying optimal clustering and the task of obtaining a clustering of small cost. This is a significant step toward understanding the success of local search heuristics in clustering applications.
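To make the notion of a "local optimum" concrete, here is a minimal sketch of single-swap local search for discrete $k$-median. The abstract does not say which Local Search variant the paper analyzes (neighborhood size, stopping rule), so the choices here are assumptions for illustration: candidate centers are the input points, the first $k$ points form the start solution, and one center is swapped at a time. The names `kmedian_cost`, `local_search_kmedian`, and `eps` are hypothetical.

```python
from itertools import product


def kmedian_cost(points, centers, dist):
    """k-median objective: sum of each point's distance to its nearest center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def local_search_kmedian(points, k, dist, eps=0.0):
    """Single-swap local search (illustrative sketch, not the paper's exact variant).

    Assumptions: candidate centers are the input points themselves, and the
    initial solution is simply the first k points.
    """
    centers = list(points[:k])
    cost = kmedian_cost(points, centers, dist)
    improved = True
    while improved:
        improved = False
        # Try replacing the center at index i with a non-center point q.
        for i, q in product(range(k), points):
            if q in centers:
                continue
            trial = centers[:i] + [q] + centers[i + 1:]
            trial_cost = kmedian_cost(points, trial, dist)
            # Accept the swap only if it lowers the cost below (1 - eps) * cost.
            if trial_cost < (1 - eps) * cost:
                centers, cost = trial, trial_cost
                improved = True
                break  # rescan all swaps from the new solution
    # No single swap improves the cost: this is a local optimum.
    return centers, cost
```

A call such as `local_search_kmedian([0, 1, 2, 10, 11, 12], 2, lambda a, b: abs(a - b))` runs the procedure on six points on a line. The returned solution is a local optimum in the sense that no single swap improves it; the paper's results concern how close such solutions are to the global optimum on stable inputs.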
