LG ML | Feb 6, 2019

An Automated Spectral Clustering for Multi-scale Data

arXiv:1902.01990v12.726 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of clustering multi-scale and high-dimensional data without manual parameter tuning, though the contribution appears incremental because it builds on existing spectral clustering methods.

The study tackled the problem of automating parameter selection in spectral clustering for multi-scale data by introducing a heuristic algorithm that estimates parameters from the data itself, achieving over 90% accuracy in most cases without prior input.

Spectral clustering algorithms typically require a priori selection of input parameters such as the number of clusters, a scaling parameter for the affinity measure, or ranges of these values for parameter tuning. Despite efforts to automate spectral clustering, the task of grouping data in multi-scale and higher-dimensional spaces has yet to be explored. This study presents a heuristic spectral clustering algorithm that obviates the need for such input by estimating the parameters from the data itself. Specifically, it introduces the heuristic of iterative eigengap search with (1) global scaling and (2) local scaling. These approaches estimate the scaling parameter and implement iterative eigengap quantification along a search tree to reveal dissimilarities at different scales of a feature space and identify clusters. The performance of these approaches has been tested on various real-world datasets: power variation data of a multi-scale nature and gene expression data. Our findings show that iterative eigengap search with a PCA-based global scaling scheme can discover different patterns with an accuracy higher than 90% in most cases without requiring a priori input information.
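To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of a single eigengap evaluation on a Gaussian affinity matrix, with either one global scaling parameter or per-point local scaling. It is an illustration under stated assumptions, not the paper's algorithm: the function names are hypothetical, the global `sigma` is passed in by hand because the abstract does not specify the PCA-based estimate, the local scaling uses the common distance-to-k-th-neighbor rule, and the iterative search tree is not implemented.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.maximum(d2, 0.0)  # clip tiny negatives from round-off


def affinity_global(X, sigma):
    """Gaussian affinity with one scaling parameter for all points.

    Assumption: sigma is supplied by the caller; the paper's PCA-based
    estimate is not reproduced here.
    """
    A = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A


def affinity_local(X, k=7):
    """Gaussian affinity with per-point scaling.

    Assumption: sigma_i is the distance from point i to its k-th nearest
    neighbor (a common local-scaling rule); requires more than k points.
    """
    d2 = pairwise_sq_dists(X)
    # Column 0 of each sorted row is the point itself, so column k is
    # the k-th nearest neighbor.
    sigma = np.sort(np.sqrt(d2), axis=1)[:, k]
    A = np.exp(-d2 / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(A, 0.0)
    return A


def eigengap_num_clusters(A, max_k=10):
    """Estimate the number of clusters from the largest eigengap of the
    symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    evals = np.linalg.eigvalsh(L)        # ascending order
    gaps = np.diff(evals[: max_k + 1])   # gaps[i] = evals[i+1] - evals[i]
    return int(np.argmax(gaps)) + 1      # a gap after the k-th eigenvalue means k clusters


# Usage on synthetic data: three well-separated 2-D blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])

print(eigengap_num_clusters(affinity_global(X, sigma=1.0)))
print(eigengap_num_clusters(affinity_local(X, k=7)))
```

A single eigengap evaluation like this works when all clusters live at a comparable scale. The abstract's iterative search along a tree is aimed at the multi-scale case, where one pass can merge fine-grained groups that only become visible after the coarse ones are separated.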
