Robust and Efficient Fuzzy C-Means Clustering Constrained on Flexible Sparsity
This work addresses robustness and efficiency in clustering for data mining applications, but it appears incremental, since it builds on existing K-Means methods with specific optimizations.
The paper tackles clustering in the presence of outliers and efficient optimization under $L_0$-norm constraints. It proposes REFCMFS, a novel algorithm that uses an $L_{2,1}$-norm robust loss and a ranking function to handle sparsity, and reports improved performance on public datasets.
Clustering is an effective data mining technique that groups a set of objects according to their attributes. Among the various clustering approaches, the family of K-Means algorithms is popular for its simplicity and efficiency. However, most existing K-Means based clustering algorithms handle outliers poorly and cannot efficiently solve problems that embed an $L_0$-norm constraint. To address these issues and significantly improve clustering performance, we propose a novel clustering algorithm, named REFCMFS, which uses an $L_{2,1}$-norm robust loss as the data-driven term and imposes an $L_0$-norm constraint on the membership matrix, making the model more robust and flexibly sparse. In particular, REFCMFS introduces a new way to simplify and solve the $L_0$-norm constraint without any approximate transformation by absorbing $\|\cdot\|_0$ into the objective function through a ranking function. These improvements not only allow REFCMFS to efficiently achieve more promising performance but also provide a new tractable optimization method for problems that embed an $L_0$-norm constraint. Theoretical analyses and extensive experiments on several public datasets demonstrate the effectiveness and rationality of the proposed REFCMFS method.
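The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the REFCMFS update rule, which the abstract does not spell out. The function names `l21_norm` and `sparse_memberships`, the parameter `k_nonzero`, and the fuzzifier `m` are assumptions introduced for illustration. The sketch computes an $L_{2,1}$ norm as the sum of unsquared row norms, then uses a ranking of point-to-center distances to keep only the `k_nonzero` nearest centers per point, so each membership row satisfies an $L_0$ bound exactly. The weights on the kept centers follow the standard fuzzy c-means formula, which is swapped in here as a stand-in.

```python
import numpy as np


def l21_norm(M):
    """L2,1 norm: sum over rows of the (unsquared) Euclidean row norm."""
    M = np.asarray(M, dtype=float)
    return float(np.sqrt((M ** 2).sum(axis=1)).sum())


def sparse_memberships(X, centers, k_nonzero, m=2.0, eps=1e-12):
    """Membership matrix with exactly k_nonzero nonzeros per row.

    Assumption: weights on the kept centers use the standard fuzzy
    c-means formula; this is an illustrative stand-in, not the
    paper's closed-form update.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Pairwise Euclidean distances, shape (n_points, n_centers).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    n, c = d.shape
    # Ranking step: indices of the k_nonzero nearest centers per point.
    order = np.argsort(d, axis=1)[:, :k_nonzero]
    rows = np.arange(n)[:, None]
    d_top = d[rows, order] + eps  # eps avoids division by zero
    # Fuzzy c-means style weights restricted to the kept centers.
    w = d_top ** (-2.0 / (m - 1.0))
    w /= w.sum(axis=1, keepdims=True)
    U = np.zeros((n, c))
    U[rows, order] = w  # all other entries stay exactly zero
    return U


X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 5.0]])
U = sparse_memberships(X, centers, k_nonzero=2)
residual_loss = l21_norm(X - U @ centers)
```

Because the loss sums unsquared row norms, a single distant outlier contributes linearly rather than quadratically, which is the usual motivation for an $L_{2,1}$ loss. The sparsity here comes from a hard selection rather than from a relaxation such as an $L_1$ penalty.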