LGDSOct 27, 2020

Improved Guarantees for k-means++ and k-means++ Parallel

arXiv:2010.14487v128 citations
Originality Incremental advance
AI Analysis

This work provides better theoretical justification for widely used clustering algorithms, which is incremental but important for the machine learning and data science communities.

The paper tackles the k-means clustering problem by analyzing k-means++ and its parallel variant, showing improved approximation guarantees to explain their strong practical performance, and introduces a new variant called Exponential Race k-means++ with similar guarantees.

In this paper, we study k-means++ and k-means++ parallel, the two most popular algorithms for the classic k-means clustering problem. We provide novel analyses and show improved approximation and bi-criteria approximation guarantees for k-means++ and k-means++ parallel. Our results give a better theoretical justification for why these algorithms perform extremely well in practice. We also propose a new variant of k-means++ parallel algorithm (Exponential Race k-means++) that has the same approximation guarantees as k-means++.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes