ML · LG · Dec 23, 2022

Stop using the elbow criterion for k-means and how to choose the number of clusters instead

arXiv:2212.12189v1 · 157 citations · h-index: 31
Originality: Synthesis-oriented
AI Analysis

This letter addresses a common problem in data analysis and teaching, but its contribution is incremental: it reviews existing alternatives rather than introducing new methods.

The paper criticizes the elbow method for selecting the number of clusters in k-means, highlighting its lack of theoretical support and the poor conclusions it can lead to, and advocates using better alternatives that have long been known in the literature.

A major challenge when using k-means clustering is often how to choose the parameter k, the number of clusters. In this letter, we want to point out that it is very easy to draw poor conclusions from a common heuristic, the "elbow method". Better alternatives have been known in the literature for a long time, and we want to draw attention to some of these easy-to-use options that often perform better. This letter is a call to stop using the elbow method altogether, because it severely lacks theoretical support. We want to encourage educators to discuss the problems of the method -- if introducing it in class at all -- and to teach alternatives instead, while researchers and reviewers should reject conclusions drawn from the elbow method.
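The abstract does not name the alternatives it has in mind, so the following is only a minimal sketch under assumptions. It scores each candidate k with the variance ratio criterion (Calinski-Harabasz index), one long-established criterion chosen here for illustration, and picks the k with the highest score instead of eyeballing an "elbow" in the sum-of-squares curve. The synthetic data, the farthest-first initialization, and all function names are hypothetical choices made for this example, not the authors' code.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, iters=100):
    """Plain Lloyd iteration with deterministic farthest-first seeding
    (an illustrative choice, not a recommendation)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean(members)
    return labels, centers


def sse(points, labels, centers):
    """Within-cluster sum of squares, the quantity the elbow plot shows."""
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))


def variance_ratio(points, labels, centers):
    """Variance ratio criterion: (between / (k-1)) / (within / (n-k))."""
    n, k = len(points), len(centers)
    overall = mean(points)
    between = sum(
        labels.count(j) * sq_dist(centers[j], overall) for j in range(k)
    )
    within = sse(points, labels, centers)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: three well-separated Gaussian blobs in 2D.
rng = random.Random(0)
true_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
points = [
    (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
    for cx, cy in true_centers
    for _ in range(50)
]

scores = {}
for k in range(2, 7):
    labels, centers = kmeans(points, k)
    scores[k] = variance_ratio(points, labels, centers)
    print(f"k={k}  SSE={sse(points, labels, centers):9.1f}  VRC={scores[k]:9.1f}")

# Choose k by maximizing the criterion rather than by visual inspection.
best_k = max(scores, key=scores.get)
print("chosen k:", best_k)
```

The point of the sketch is that the choice of k comes from an explicit, reproducible rule (an argmax over a criterion) rather than from a visual judgment about where a curve bends. The criterion is undefined for k = 1, so it cannot by itself indicate that the data has no cluster structure.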

Code Implementations: 2 repos