LGOct 16, 2016

Convergence rate of stochastic k-means

arXiv:1610.04900v23 citations
Originality Highly original
AI Analysis

This provides theoretical guarantees for large-scale clustering and unsupervised feature learning, addressing a gap in understanding the convergence behavior of popular stochastic k-means algorithms.

The paper tackles the problem of analyzing the convergence rate of stochastic k-means variants, showing for the first time that they achieve global convergence to local optima at an O(1/t) rate under general conditions, and with clusterable data and suitable initialization, mini-batch k-means converges to an optimal solution at the same rate with high probability.

We analyze online and mini-batch k-means variants. Both scale up the widely used Lloyd 's algorithm via stochastic approximation, and have become popular for large-scale clustering and unsupervised feature learning. We show, for the first time, that they have global convergence towards local optima at $O(\frac{1}{t})$ rate under general conditions. In addition, we show if the dataset is clusterable, with suitable initialization, mini-batch k-means converges to an optimal k-means solution with $O(\frac{1}{t})$ convergence rate with high probability. The k-means objective is non-convex and non-differentiable: we exploit ideas from non-convex gradient-based optimization by providing a novel characterization of the trajectory of k-means algorithm on its solution space, and circumvent its non-differentiability via geometric insights about k-means update.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes