ML, LG, ME | Oct 2, 2020

Regularized K-means through hard-thresholding

arXiv:2010.00950v1, 1 citation
Originality: Incremental advance
AI Analysis

This work addresses regularization in clustering. The advance is incremental: it builds on existing regularized $K$-means methods and differs mainly in the penalization strategy it adopts.

The paper regularizes $K$-means clustering by penalizing the size of the cluster centers. It proposes HT $K$-means, which uses an $\ell_0$ penalty to induce sparsity in the variables, reports that the method compares favorably with popular regularized $K$-means methods in simulations, and applies it to several real data examples.

We study a framework of regularized $K$-means methods based on direct penalization of the size of the cluster centers. Different penalization strategies are considered and compared through simulation and theoretical analysis. Based on the results, we propose HT $K$-means, which uses an $\ell_0$ penalty to induce sparsity in the variables. Different techniques for selecting the tuning parameter are discussed and compared. The proposed method stacks up favorably with the most popular regularized $K$-means methods in an extensive simulation study. Finally, HT $K$-means is applied to several real data examples. Graphical displays are presented and used in these examples to gain more insight into the datasets.
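To make the idea of an $\ell_0$ penalty on the cluster centers concrete, the following is a minimal sketch and not the authors' implementation. It assumes an objective of the form $\frac{1}{n}\sum_i \|x_i - \mu_{c(i)}\|^2 + \lambda \cdot \#\{j : \mu_{\cdot j} \neq 0\}$ on column-centered data, which the text above does not state explicitly. The function name `ht_kmeans` and the parameters `lam` and `init` are hypothetical. Under that assumption, a variable is kept only if using its cluster means lowers the average squared error by more than `lam`; otherwise its centers are set to zero in every cluster.

```python
import numpy as np

def ht_kmeans(X, k, lam, init=None, n_iter=50, seed=0):
    """Sketch of K-means with a per-variable l0 (hard-thresholding) penalty.

    Assumed objective (an illustration, not taken from the paper's text):
        (1/n) * sum_i ||x_i - mu_{c(i)}||^2 + lam * #{j : mu_{.j} != 0}
    X is assumed to be column-centered, so a zeroed variable sits at its mean.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    if init is None:
        # Random initialization from k distinct observations.
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.array(init, dtype=float)

    def assign(c):
        # Nearest center in squared Euclidean distance.
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        counts = np.bincount(labels, minlength=k)
        # Ordinary K-means update; an empty cluster keeps its old center.
        new_centers = centers.copy()
        for c in range(k):
            if counts[c] > 0:
                new_centers[c] = X[labels == c].mean(axis=0)
        # Reduction in average squared error from keeping variable j,
        # relative to setting its center to zero in every cluster.
        gain = (counts[:, None] * new_centers ** 2).sum(axis=0) / n
        # Hard-thresholding: drop variables whose gain does not exceed lam.
        new_centers[:, gain <= lam] = 0.0
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return centers, assign(centers)
```

In this sketch, columns of the returned `centers` that are entirely zero correspond to variables that do not contribute to the clustering. Larger values of `lam` remove more variables, which is why the choice of tuning parameter matters.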
