LGMLJul 21, 2020

Spectral Clustering using Eigenspectrum Shape Based Nystrom Sampling

arXiv:2007.11416v11 citations
Originality Incremental advance
AI Analysis

This work addresses scalability issues in spectral clustering for large datasets, but it appears incremental as it builds on existing Nyström methods with a new sampling approach.

The paper tackles the computational complexity of spectral clustering for large-scale data by proposing a scalable Nyström-based algorithm with a new sampling procedure called CMS3 and a heuristic based on eigen spectrum shape, achieving competitive low-rank approximations compared to state-of-the-art methods.

Spectral clustering has shown a superior performance in analyzing the cluster structure. However, its computational complexity limits its application in analyzing large-scale data. To address this problem, many low-rank matrix approximating algorithms are proposed, including the Nystrom method - an approach with proven approximate error bounds. There are several algorithms that provide recipes to construct Nystrom approximations with variable accuracies and computing times. This paper proposes a scalable Nystrom-based clustering algorithm with a new sampling procedure, Centroid Minimum Sum of Squared Similarities (CMS3), and a heuristic on when to use it. Our heuristic depends on the eigen spectrum shape of the dataset, and yields competitive low-rank approximations in test datasets compared to the other state-of-the-art methods

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes