LG · CV · Feb 1, 2024

Deep Clustering Using the Soft Silhouette Score: Towards Compact and Well-Separated Clusters

arXiv:2402.00608v1 · 25 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This work aims to improve clustering performance in unsupervised learning for big data applications. The advance is incremental, since it builds on existing deep clustering methods.

The authors propose soft silhouette, a probabilistic formulation of the silhouette coefficient that guides learned representations toward compact and well-separated clusters. They report very satisfactory clustering results on benchmark datasets.

Unsupervised learning has gained prominence in the big data era, offering a means to extract valuable insights from unlabeled datasets. Deep clustering has emerged as an important unsupervised category, aiming to exploit the non-linear mapping capabilities of neural networks in order to enhance clustering performance. The majority of the deep clustering literature focuses on minimizing the inner-cluster variability in some embedded space while keeping the learned representation consistent with the original high-dimensional dataset. In this work, we propose soft silhouette, a probabilistic formulation of the silhouette coefficient. Like the conventional silhouette coefficient, soft silhouette rewards compact and distinctly separated clustering solutions. When optimized within a deep clustering framework, soft silhouette guides the learned representations towards forming compact and well-separated clusters. In addition, we introduce an autoencoder-based deep learning architecture that is suitable for optimizing the soft silhouette objective function. The proposed deep clustering method has been tested and compared with several well-studied deep clustering methods on various benchmark datasets, yielding very satisfactory clustering results.
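To make the idea of a probabilistic silhouette concrete, here is a minimal NumPy sketch of one way such a score could be computed from soft cluster assignments. It is an illustration under stated assumptions, not necessarily the exact formulation in the paper: the function name `soft_silhouette`, the Euclidean distance, the leave-one-out normalisation of cluster mass, and the toy data are all choices made for this example. With one-hot (hard) assignments the sketch reduces to the conventional silhouette coefficient.

```python
import numpy as np

def soft_silhouette(X, P, eps=1e-12):
    """Illustrative soft silhouette (assumed formulation, not the paper's code).

    X: (n, d) array of points or embeddings.
    P: (n, k) array of soft assignments, each row summing to 1, with k >= 2.
    """
    # Pairwise Euclidean distances, shape (n, n); the diagonal is zero.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Probability-weighted distance from point i to cluster c:
    # sum_j P[j, c] * D[i, j], divided by the mass of cluster c without point i.
    weighted = D @ P                       # (n, k)
    mass = P.sum(axis=0)[None, :] - P      # (n, k)
    A = weighted / (mass + eps)
    # For each (i, c): the smallest weighted distance to any *other* cluster.
    k = P.shape[1]
    B = np.empty_like(A)
    for c in range(k):
        B[:, c] = np.delete(A, c, axis=1).min(axis=1)
    # Per-cluster silhouette of each point, then the expectation under P.
    S = (B - A) / (np.maximum(A, B) + eps)
    return float((P * S).sum(axis=1).mean())

# Toy data (hypothetical): two tight, well-separated groups on a line.
X = np.array([[0.0], [1.0], [10.0], [11.0]])
P_hard = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
P_uniform = np.full((4, 2), 0.5)

score_hard = soft_silhouette(X, P_hard)
score_uniform = soft_silhouette(X, P_uniform)
```

In this sketch, confident assignments that match the two groups score close to 1, while completely uninformative assignments score 0. Because every step is differentiable almost everywhere, the same computation could in principle be written in an autodiff framework and used as a training objective, which is the role the abstract describes for soft silhouette.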

