LG · AI · Jul 29, 2024

Adaptive Self-supervised Robust Clustering for Unstructured Data with Unknown Cluster Number

arXiv:2407.20119v2 · 1 citation · h-index: 24
AI Analysis

This paper addresses the challenge of clustering unstructured data when the number of clusters is unknown. The contribution appears incremental, since it builds on existing techniques such as graph auto-encoders and robust continuous clustering.

The paper tackles the problem of clustering unstructured data without prior knowledge of the number of clusters by introducing ASRC, which adaptively learns graph structures and uses a graph auto-encoder with contrastive learning, achieving superior performance over other models on seven benchmark datasets.
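How a cluster count can emerge without being specified is easiest to see in a small sketch. The code below is not the ASRC or RCC implementation; it assumes that each point already has a learned representative vector and that clusters are read off as connected components of the graph after dropping edges whose endpoint representatives are far apart. The function name `clusters_from_representatives`, the `threshold` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def clusters_from_representatives(U, edges, threshold):
    """Label points by connected components of a thresholded graph.

    U: (n, d) array of per-point representatives (hypothetical input).
    edges: iterable of (i, j) index pairs from the learned graph.
    threshold: an edge is kept only if its endpoint representatives
        are closer than this distance (assumed rule, for illustration).
    """
    n = U.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        if np.linalg.norm(U[i] - U[j]) < threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    roots = [find(i) for i in range(n)]
    # Map component roots to consecutive labels 0..k-1.
    _, labels = np.unique(roots, return_inverse=True)
    return labels, int(labels.max()) + 1


# Toy example: a chain graph over five points in three tight groups.
U = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]])
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
labels, k = clusters_from_representatives(U, edges, threshold=0.5)
```

The number of clusters `k` is an output of the procedure rather than an input, which is the property the summary above highlights.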

We introduce a novel self-supervised deep clustering approach, termed Adaptive Self-supervised Robust Clustering (ASRC), tailored for unstructured data and requiring no prior knowledge of the number of clusters. In particular, ASRC adaptively learns the graph structure and edge weights to capture both local and global structural information. The obtained graph enables us to learn clustering-friendly feature representations with an enhanced graph auto-encoder trained with a contrastive learning technique. ASRC further leverages the clustering results adaptively obtained by robust continuous clustering (RCC) to generate prototypes for negative sampling, which promotes consistency among positive pairs and enlarges the gap between positive and negative samples. ASRC obtains the final clustering results by applying RCC to the learned feature representations with their consistent graph structure and edge weights. Extensive experiments on seven benchmark datasets demonstrate the efficacy of ASRC and its superior performance over other popular clustering models. Notably, ASRC even outperforms methods that rely on prior knowledge of the number of clusters, highlighting its effectiveness in addressing the challenges of clustering unstructured data.
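The role of cluster prototypes as negatives can be illustrated with a minimal sketch. The abstract does not give the loss, so everything here is an assumption: an InfoNCE-style objective, two embedding views per point as the positive pair, prototypes computed as normalized cluster means, a point's own prototype masked out of its negatives, and a hypothetical `temperature` parameter. It is not the authors' formulation.

```python
import numpy as np


def prototype_contrastive_loss(z1, z2, labels, temperature=0.5):
    """InfoNCE-style loss with cluster prototypes as negatives (assumed form).

    z1, z2: (n, d) embeddings of two views of the same n points.
    labels: (n,) current cluster assignments (e.g. from a clustering step).
    """
    labels = np.asarray(labels)
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    # One prototype per cluster: the normalized mean embedding.
    ks = np.unique(labels)
    protos = np.stack([z1[labels == k].mean(axis=0) for k in ks])
    protos /= np.linalg.norm(protos, axis=1, keepdims=True)

    pos = np.sum(z1 * z2, axis=1) / temperature  # (n,) positive-pair logits
    neg = z1 @ protos.T / temperature            # (n, K) prototype logits

    # A point's own cluster prototype is not a negative: mask it out.
    own = labels[:, None] == ks[None, :]
    neg = np.where(own, -np.inf, neg)

    # Numerically stable log-sum-exp over [positive, negatives].
    logits = np.concatenate([pos[:, None], neg], axis=1)
    m = logits.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return float(np.mean(lse - pos))


# Toy example: two clusters; views paired within a cluster vs. across clusters.
z = np.array([[1.0, 0.1], [1.0, -0.1], [0.1, 1.0], [-0.1, 1.0]])
labels = np.array([0, 0, 1, 1])
loss_consistent = prototype_contrastive_loss(z, z[[1, 0, 3, 2]], labels)
loss_mismatched = prototype_contrastive_loss(z, z[[2, 3, 0, 1]], labels)
```

In this toy setup the loss is lower when the second view stays in the same cluster as the first, which matches the stated aim of pulling positive pairs together while pushing points away from other clusters' prototypes.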

