LG · AI · Nov 1, 2024

Similarity and Dissimilarity Guided Co-association Matrix Construction for Ensemble Clustering

arXiv:2411.00904v1 · 5 citations · h-index: 3 · Has Code · IEEE Trans Knowl Data Eng
Originality: Incremental advance
AI Analysis

This work addresses ensemble clustering for data analysis, offering incremental improvements by better utilizing cluster quality and dissimilarity information.

The paper tackles ensemble clustering by proposing a method that incorporates both similarity and dissimilarity information from the base clusterings. The authors report improved accuracy and robustness compared with 13 state-of-the-art methods across 12 datasets.

Ensemble clustering aggregates multiple weak clusterings to achieve a more accurate and robust consensus result. The Co-Association matrix (CA matrix) based method is the mainstream ensemble clustering approach: it constructs the similarity relationships between sample pairs according to the weak clustering partitions and uses them to generate the final clustering result. However, existing methods neglect that the quality of a cluster is related to its size, i.e., a smaller cluster tends to have higher accuracy. Moreover, they do not consider the valuable dissimilarity information in the base clusterings, which can reflect the varying importance of sample pairs that are completely disconnected. To this end, we propose the Similarity and Dissimilarity Guided Co-association matrix (SDGCA) to achieve ensemble clustering. First, we introduce normalized ensemble entropy to estimate the quality of each cluster, and construct a similarity matrix based on this estimation. Then, we employ a random walk to explore the high-order proximity of base clusterings and construct a dissimilarity matrix. Finally, the adversarial relationship between the similarity matrix and the dissimilarity matrix is utilized to construct a promoted CA matrix for ensemble clustering. We compared our method with 13 state-of-the-art methods across 12 datasets, and the results demonstrate the superior clustering ability and robustness of the proposed approach. The code is available at https://github.com/xuz2019/SDGCA.
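
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of a plain, unweighted co-association matrix: entry (i, j) is the fraction of base clusterings that place samples i and j in the same cluster. It is not the SDGCA method and is not taken from the authors' repository; the function name `co_association` and the toy labels are assumptions made for illustration only.

```python
import numpy as np


def co_association(base_labels):
    """Return the N x N co-association matrix for M base clusterings.

    base_labels: array-like of shape (M, N); row m holds the cluster label
    that base clustering m assigns to each of the N samples.
    (Hypothetical helper, not part of the SDGCA code base.)
    """
    labels = np.asarray(base_labels)
    # same[m, i, j] is True when clustering m puts samples i and j together.
    same = labels[:, :, None] == labels[:, None, :]
    # Averaging over the M clusterings gives the co-occurrence fraction.
    return same.mean(axis=0)


# Toy ensemble: 3 base clusterings of 4 samples (made-up labels).
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(base)
print(ca)
```

In this toy ensemble, samples 0 and 3 never share a cluster, so their entry is zero. Pairs like this are the "completely disconnected" pairs mentioned in the abstract, which a plain CA matrix cannot rank against one another.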

Code Implementations: 1 repo