Partitioning Well-Clustered Graphs: Spectral Clustering Works!
This work addresses graph partitioning, a problem with applications in data analysis and machine learning, and offers both theoretical guarantees and an efficient algorithm. Its contribution is incremental in that it extends guarantees already known for restricted settings to a more general class of graphs.
The paper studies the partitioning of well-clustered graphs by spectral clustering. It shows that spectral clustering gives a good approximation of the optimal clustering for a broad class of graphs, a guarantee previously known only for graphs generated from stochastic models. It also introduces a nearly-linear time algorithm based on computing a matrix exponential and on approximate nearest neighbor data structures.
In this paper we study variants of the widely used spectral clustering that partitions a graph into k clusters by (1) embedding the vertices of a graph into a low-dimensional space using the bottom eigenvectors of the Laplacian matrix, and (2) grouping the embedded points into k clusters via k-means algorithms. We show that, for a wide class of graphs, spectral clustering gives a good approximation of the optimal clustering. While this approach was proposed in the early 1990s and has comprehensive applications, prior to our work similar results were known only for graphs generated from stochastic models. We also give a nearly-linear time algorithm for partitioning well-clustered graphs based on computing a matrix exponential and approximate nearest neighbor data structures.
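The two-step procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it assumes the normalized Laplacian, a degree-scaled embedding, a dense eigensolver, and a plain Lloyd's k-means with a deterministic farthest-point initialization. The function names (`spectral_embedding`, `kmeans`, `spectral_clustering`) are hypothetical, and the nearly-linear time variant based on a matrix exponential is not shown.

```python
import numpy as np


def spectral_embedding(adj, k):
    """Step (1): embed vertices using the k bottom eigenvectors of the Laplacian.

    Assumption: the normalized Laplacian L = I - D^{-1/2} A D^{-1/2} is used,
    and the graph has no isolated vertices (all degrees are positive).
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh returns eigenvalues of a symmetric matrix in ascending order,
    # so the first k columns are the bottom k eigenvectors.
    _, vecs = np.linalg.eigh(lap)
    # Assumption: scale each row by 1/sqrt(degree), so that vertices in the
    # same (nearly) disconnected component map to (nearly) the same point.
    return vecs[:, :k] * d_inv_sqrt[:, None]


def kmeans(points, k, iters=50):
    """Step (2): Lloyd's k-means with a deterministic farthest-point start."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)  # copy, so updates do not modify `points`

    def assign(ctrs):
        d2 = ((points[:, None, :] - ctrs[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(iters):
        labels = assign(centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return assign(centers)


def spectral_clustering(adj, k):
    """Partition a graph, given by its adjacency matrix, into k clusters."""
    return kmeans(spectral_embedding(adj, k), k)


# Example graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

print(spectral_clustering(adj, 2))
```

On this small graph the two triangles should end up in different clusters. The dense eigendecomposition takes time cubic in the number of vertices, so the sketch illustrates the pipeline only and says nothing about the running time the abstract claims.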