DS LG SIJun 7, 2020

Average Sensitivity of Spectral Clustering

arXiv:2006.04094v18.627 citations

Originality Incremental advance

AI Analysis

This addresses reliability concerns for data mining applications where graphs have missing edges, but it is incremental as it builds on existing spectral clustering theory.

The paper tackles the problem of assessing the stability of spectral clustering against edge perturbations in graphs, proving that average sensitivity is proportional to λ₂/λ₃² and confirming this empirically on synthetic and real networks, showing stability when cluster structure exists.

Spectral clustering is one of the most popular clustering methods for finding clusters in a graph, which has found many applications in data mining. However, the input graph in those applications may have many missing edges due to error in measurement, withholding for a privacy reason, or arbitrariness in data conversion. To make reliable and efficient decisions based on spectral clustering, we assess the stability of spectral clustering against edge perturbations in the input graph using the notion of average sensitivity, which is the expected size of the symmetric difference of the output clusters before and after we randomly remove edges. We first prove that the average sensitivity of spectral clustering is proportional to $λ_2/λ_3^2$, where $λ_i$ is the $i$-th smallest eigenvalue of the (normalized) Laplacian. We also prove an analogous bound for $k$-way spectral clustering, which partitions the graph into $k$ clusters. Then, we empirically confirm our theoretical bounds by conducting experiments on synthetic and real networks. Our results suggest that spectral clustering is stable against edge perturbations when there is a cluster structure in the input graph.

View on arXiv PDF

Similar