MLLGMar 1, 2021

The Mathematics Behind Spectral Clustering And The Equivalence To PCA

arXiv:2103.00733v12 citations
Originality Synthesis-oriented
AI Analysis

It provides theoretical clarity for researchers in machine learning and data analysis, but it is incremental as it builds on existing spectral clustering theory.

This paper tackles the problem of explaining the mathematical foundations of spectral clustering by categorizing it based on graph connectivity and demonstrating equivalence to PCA. It proves that for fully connected graphs, spectral embedding minimizes a covariance objective, and for multi-connected graphs, eigenvectors indicate connected components.

Spectral clustering is a popular algorithm that clusters points using the eigenvalues and eigenvectors of Laplacian matrices derived from the data. For years, spectral clustering has been working mysteriously. This paper explains spectral clustering by dividing it into two categories based on whether the graph Laplacian is fully connected or not. For a fully connected graph, this paper demonstrates the dimension reduction part by offering an objective function: the covariance between the original data points' similarities and the mapped data points' similarities. For a multi-connected graph, this paper proves that with a proper $k$, the first $k$ eigenvectors are the indicators of the connected components. This paper also proves there is an equivalence between spectral embedding and PCA.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes