An Improved and Generalised Analysis for Spectral Clustering
This provides a more general theoretical foundation for Spectral Clustering, applicable to domains like ecological networks, but is incremental as it builds on existing analyses.
The paper tackles the theoretical analysis of Spectral Clustering by showing it works well when eigenvalues are grouped and separated, extending to hierarchical clusters and directed graphs, with validation on synthetic and real-world data.
We revisit the theoretical performances of Spectral Clustering, a classical algorithm for graph partitioning that relies on the eigenvectors of a matrix representation of the graph. Informally, we show that Spectral Clustering works well as long as the smallest eigenvalues appear in groups well separated from the rest of the matrix representation's spectrum. This arises, for example, whenever there exists a hierarchy of clusters at different scales, a regime not captured by previous analyses. Our results are very general and can be applied beyond the traditional graph Laplacian. In particular, we study Hermitian representations of digraphs and show Spectral Clustering can recover partitions where edges between clusters are oriented mostly in the same direction. This has applications in, for example, the analysis of trophic levels in ecological networks. We demonstrate that our results accurately predict the performances of Spectral Clustering on synthetic and real-world data sets.