LG, AI, ML, Apr 25, 2016

Weighted Spectral Cluster Ensemble

arXiv:1604.07178v1, 19 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in clustering ensemble methods for data analysis, offering a robust solution to improve clustering accuracy.

The paper tackles the sensitivity of Cluster Ensemble Selection (CES) to its diversity metric and thresholding procedure by proposing Weighted Spectral Cluster Ensemble (WSCE), which uses modularity from community detection for diversity estimation and eliminates thresholding, achieving superior performance over state-of-the-art methods in experiments on varied datasets.

Clustering explores meaningful patterns in unlabeled data sets. Cluster Ensemble Selection (CES) is a new approach that combines individual clustering results to improve the performance of the final result. Although CES can achieve better final results than individual clustering algorithms and cluster ensemble methods, its performance can be dramatically affected by its consensus diversity metric and thresholding procedure. There are two problems in CES: 1) most of the diversity metrics are based on heuristic Shannon's entropy, and 2) estimating threshold values is really hard in practice. The main goal of this paper is to propose a robust approach for solving these two problems. Accordingly, this paper develops a novel framework for clustering problems, called Weighted Spectral Cluster Ensemble (WSCE), by exploiting concepts from community detection and graph-based clustering. Under this framework, a new version of spectral clustering, called Two Kernels Spectral Clustering, is used to generate graph-based individual clustering results. Further, by applying modularity, a well-known metric in community detection, to the transformed graph representation of the individual clustering results, our approach provides an effective diversity estimation for those results. Moreover, this paper introduces a new approach for combining the evaluated individual clustering results without a thresholding procedure. An experimental study on varied data sets demonstrates that the proposed approach achieves superior performance to state-of-the-art methods.
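The abstract does not give formulas, so the following is only a rough sketch of the two ideas it names: scoring each individual clustering with modularity on a graph, and combining the scored clusterings by weight rather than by a threshold. It assumes the standard Newman modularity and a plain weighted co-association matrix; the function names `modularity` and `weighted_coassociation`, the clipping of negative scores to zero, and the toy graph are illustrative assumptions, not the paper's WSCE procedure.

```python
import numpy as np

def modularity(adjacency, labels):
    """Newman modularity Q of a partition on an undirected (weighted) graph.

    Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * [c_i == c_j]
    """
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    degrees = A.sum(axis=1)                # k_i
    two_m = degrees.sum()                  # 2m, twice the total edge weight
    same = labels[:, None] == labels[None, :]
    expected = np.outer(degrees, degrees) / two_m
    return float(((A - expected) * same).sum() / two_m)

def weighted_coassociation(adjacency, partitions):
    """Combine partitions into one consensus matrix, weighted by modularity.

    Assumption (not from the paper): negative scores are clipped to zero and
    the rest are normalized, so no partition is dropped by a threshold.
    """
    scores = np.array([max(modularity(adjacency, p), 0.0) for p in partitions])
    if scores.sum() == 0.0:
        weights = np.full(len(partitions), 1.0 / len(partitions))
    else:
        weights = scores / scores.sum()
    n = len(partitions[0])
    consensus = np.zeros((n, n))
    for w, p in zip(weights, partitions):
        p = np.asarray(p)
        consensus += w * (p[:, None] == p[None, :])
    return weights, consensus

# Toy graph: two triangles joined by a single edge (hypothetical data).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

good = [0, 0, 0, 1, 1, 1]      # matches the two triangles
trivial = [0, 0, 0, 0, 0, 0]   # everything in one cluster
weights, consensus = weighted_coassociation(A, [good, trivial])
print(modularity(A, good), weights)
```

In this sketch a partition that ignores the graph structure receives a weight near zero and contributes little to the consensus matrix, which can then be passed to any final clustering step. How WSCE actually derives its weights and graphs is described in the paper itself.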

