ML · ST · Oct 5, 2013

Role of normalization in spectral clustering for stochastic blockmodels

arXiv:1310.1495v2 · 106 citations
Originality: Incremental advance
AI Analysis

The paper gives a theoretical justification for normalization, a common practice in spectral clustering, closing a gap in understanding for researchers in machine learning and statistics.

The paper asks why normalization improves spectral clustering accuracy in stochastic blockmodels, and shows theoretically that, under a broad parameter regime, normalization shrinks the spread of points within a class by a constant fraction.

Spectral clustering is a technique that clusters elements using the top few eigenvectors of their (possibly normalized) similarity matrix. The quality of spectral clustering is closely tied to the convergence properties of these principal eigenvectors. This rate of convergence has been shown to be identical for both the normalized and unnormalized variants in recent random matrix theory literature. However, normalization for spectral clustering is commonly believed to be beneficial [Stat. Comput. 17 (2007) 395-416]. Indeed, our experiments show that normalization improves prediction accuracy. In this paper, for the popular stochastic blockmodel, we theoretically show that normalization shrinks the spread of points in a class by a constant fraction under a broad parameter regime. As a byproduct of our work, we also obtain sharp deviation bounds of empirical principal eigenvalues of graphs generated from a stochastic blockmodel.
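The comparison described in the abstract can be sketched numerically. The following is a minimal illustration, not the paper's experiment: it samples one graph from a two-class stochastic blockmodel, embeds the nodes with the top two eigenvectors of the adjacency matrix A and of the normalized matrix D^{-1/2} A D^{-1/2}, and reports a within-class spread for each. The block probabilities `B`, the graph size `n`, the helper names, and the `relative_spread` measure (within-class RMS distance to the class centroid, divided by the distance between centroids) are all assumptions chosen for illustration.

```python
import numpy as np

def sample_sbm(labels, B, rng):
    """Sample a symmetric 0/1 adjacency matrix without self-loops from a blockmodel."""
    P = B[np.ix_(labels, labels)]                  # edge probability for every node pair
    upper = np.triu(rng.random(P.shape) < P, k=1)  # draw each pair once (upper triangle)
    return (upper | upper.T).astype(float)

def normalize(A):
    """Return D^{-1/2} A D^{-1/2}; isolated nodes keep all-zero rows and columns."""
    deg = A.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return A * inv_sqrt[:, None] * inv_sqrt[None, :]

def top_eigenvectors(M, k):
    """Eigenvectors of symmetric M for the k eigenvalues largest in magnitude."""
    vals, vecs = np.linalg.eigh(M)
    idx = np.argsort(np.abs(vals))[-k:]
    return vecs[:, idx]

def relative_spread(X, labels):
    """Within-class RMS distance to the class centroid, relative to centroid separation.

    Assumes labels are the integers 0..K-1 (an illustrative measure, not the paper's).
    """
    k = labels.max() + 1
    centroids = np.array([X[labels == c].mean(axis=0) for c in range(k)])
    within = np.sqrt(np.mean(np.sum((X - centroids[labels]) ** 2, axis=1)))
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    between = dists[np.triu_indices(k, k=1)].mean()
    return within / between

# Hypothetical parameters: two equal classes, denser within classes than across.
rng = np.random.default_rng(0)
n = 400
labels = np.repeat([0, 1], n // 2)
B = np.array([[0.12, 0.03],
              [0.03, 0.06]])

A = sample_sbm(labels, B, rng)
spread_raw = relative_spread(top_eigenvectors(A, 2), labels)
spread_norm = relative_spread(top_eigenvectors(normalize(A), 2), labels)

print(f"relative spread, unnormalized: {spread_raw:.4f}")
print(f"relative spread, normalized:   {spread_norm:.4f}")
```

Dividing by the centroid separation makes the two embeddings comparable even though their scales differ. A single draw with these parameters says nothing about the paper's constants; averaging over many draws and parameter settings would be needed to see the effect the abstract describes.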
