LGAIJul 23, 2021

A Simple Approach to Automated Spectral Clustering

arXiv:2107.12183v432 citations
Originality Incremental advance
AI Analysis

This work addresses the problem of reducing manual hyperparameter tuning and method selection in spectral clustering for practitioners, though it is incremental as it builds on existing affinity-matrix-construction techniques.

The authors tackled the challenge of automating spectral clustering by introducing a method to select the most reliable affinity matrix using a new metric called relative-eigen-gap, and proposed a fast affinity-matrix-construction method based on least squares representation. Their approach achieved more versatile, accurate, and efficient results in natural image clustering experiments compared to baseline methods.

The performance of spectral clustering heavily relies on the quality of affinity matrix. A variety of affinity-matrix-construction (AMC) methods have been proposed but they have hyperparameters to determine beforehand, which requires strong experience and leads to difficulty in real applications, especially when the inter-cluster similarity is high and/or the dataset is large. In addition, we often need to choose different AMC methods for different datasets, which still depends on experience. To solve these two challenging problems, in this paper, we present a simple yet effective method for automated spectral clustering. First, we propose to find the most reliable affinity matrix via grid search or Bayesian optimization among a set of candidates given by different AMC methods with different hyperparameters, where the reliability is quantified by the \textit{relative-eigen-gap} of graph Laplacian introduced in this paper. Second, we propose a fast and accurate AMC method based on least squares representation and thresholding and prove its effectiveness theoretically. Finally, we provide a large-scale extension for the automated spectral clustering method, of which the time complexity is linear with the number of data points. Extensive experiments of natural image clustering show that our method is more versatile, accurate, and efficient than baseline methods.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes