ML LG ME · Oct 2, 2020

Regularized K-means through hard-thresholding

arXiv:2010.00950v12.71 citations

Originality: Incremental advance

AI Analysis

This work addresses the regularization of K-means clustering. It is an incremental advance: it builds on existing regularized K-means methods and compares specific penalization strategies.

The paper regularizes K-means clustering by directly penalizing the size of the cluster centers. It proposes HT K-means, which uses an ℓ₀ penalty to induce sparsity in the variables, reports that the method compares favorably with popular regularized K-means methods in simulations, and applies it to several real data examples.

We study a framework of regularized $K$-means methods based on direct penalization of the size of the cluster centers. Different penalization strategies are considered and compared through simulation and theoretical analysis. Based on the results, we propose HT $K$-means, which uses an $\ell_0$ penalty to induce sparsity in the variables. Different techniques for selecting the tuning parameter are discussed and compared. The proposed method stacks up favorably with the most popular regularized $K$-means methods in an extensive simulation study. Finally, HT $K$-means is applied to several real data examples. Graphical displays are presented and used in these examples to gain more insight into the datasets.
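To make the idea of an $\ell_0$ penalty on cluster centers concrete, here is a minimal sketch and not the authors' implementation. It assumes standardized variables, a penalty of `n * lam` per variable whose centers are not all zero, and a Lloyd-style loop in which a variable is kept only if its between-cluster sum of squares exceeds that penalty. The function name `ht_kmeans`, its parameters, and the penalty scaling are illustrative assumptions.

```python
import numpy as np


def ht_kmeans(X, k, lam, init=None, n_iter=100, seed=0):
    """Sketch of K-means with hard-thresholded centers (assumed formulation).

    Assumes every column of X has nonzero variance. `init` is an optional
    list of k row indices used as starting centers (hypothetical parameter).
    """
    X = np.asarray(X, dtype=float)
    n, _ = X.shape
    # Standardize so that a zero center coincides with the overall mean.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    if init is None:
        init = np.random.default_rng(seed).choice(n, size=k, replace=False)
    centers = Z[np.asarray(init)].copy()
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center in squared Euclidean distance.
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: cluster means (an empty cluster keeps its old center).
        counts = np.bincount(labels, minlength=k)
        for c in range(k):
            if counts[c] > 0:
                centers[c] = Z[labels == c].mean(axis=0)
        # Hard-thresholding step: the drop in within-cluster sum of squares
        # from using nonzero centers on variable j is sum_c n_c * mu_cj^2.
        # Zero the variable unless that drop exceeds the penalty n * lam.
        gain = (counts[:, None] * centers ** 2).sum(axis=0)
        centers[:, gain <= n * lam] = 0.0
    active = np.flatnonzero(np.any(centers != 0.0, axis=0))
    return labels, centers, active


# Toy data: column 0 separates two groups, column 1 is only weakly related.
X = np.array([[-5.2, 1], [-4.8, 1], [-5.2, 1], [-4.8, -1],
              [4.8, 1], [5.2, -1], [4.8, -1], [5.2, -1]])
labels, centers, active = ht_kmeans(X, k=2, lam=0.5, init=[0, 4])
print(labels, active)
```

In this sketch a larger `lam` removes more variables from the clustering, while a small `lam` approaches ordinary K-means on the standardized data. How to choose the tuning parameter is one of the questions the abstract says the paper discusses, and it is not addressed here.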
