STMLFeb 27, 2022

Asymptotic Theory of Geometric and Adaptive $k$-Means Clustering

arXiv:2202.13423v25 citations
AI Analysis

This work provides a foundational theory for clustering in advanced geometric spaces, which is incremental as it builds on Pollard's classical result.

The paper tackles the problem of extending consistency results for k-means clustering to geometric settings and adaptive parameter selection, proving strong consistency and asymptotic limit theorems without requiring uniqueness of optimal cluster centers.

We revisit Pollard's classical result on consistency for $k$-means clustering in Euclidean space, with a focus on extensions in two directions: first, to problems where the data may come from interesting geometric settings (e.g., Riemannian manifolds, reflexive Banach spaces, or the Wasserstein space); second, to problems where some parameters are chosen adaptively from the data (e.g., $k$-medoids or elbow-method $k$-means). Towards this end, we provide a general theory which shows that all clustering procedures described above are strongly consistent. In fact, our method of proof allows us to derive many asymptotic limit theorems beyond strong consistency. We also remove all assumptions about uniqueness of the set of optimal cluster centers.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes