LGJan 13, 2022

A Geometric Approach to $k$-means

arXiv:2201.04822v2
AI Analysis

This addresses the fundamental issue of local optima in k-means clustering for scientific and engineering domains, representing an incremental improvement over existing methods.

The paper tackles the nonconvex optimization problem in k-means clustering by proposing a geometric algorithmic framework to escape undesirable local solutions and recover the global or ground truth clustering, with theoretical justifications and extensive experiments demonstrating its efficacy.

\kmeans clustering is a fundamental problem in many scientific and engineering domains. The optimization problem associated with \kmeans clustering is nonconvex, for which standard algorithms are only guaranteed to find a local optimum. Leveraging the hidden structure of local solutions, we propose a general algorithmic framework for escaping undesirable local solutions and recovering the global solution or the ground truth clustering. This framework consists of iteratively alternating between two steps: (i) detect mis-specified clusters in a local solution, and (ii) improve the local solution by non-local operations. We discuss specific implementation of these steps, and elucidate how the proposed framework unifies many existing variants of \kmeans algorithms through a geometric perspective. We also present two natural variants of the proposed framework, where the initial number of clusters may be over- or under-specified. We provide theoretical justifications and extensive experiments to demonstrate the efficacy of the proposed approach.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes