ML ITMar 13, 2014

Neighborhood Selection for Thresholding-based Subspace Clustering

Reinhard Heckel, Eirikur Agustsson, Helmut Bölcskei

arXiv:1403.3438v110 citations

Originality Synthesis-oriented

AI Analysis

This work addresses subspace clustering for high-dimensional data analysis, but it is incremental as it builds on existing TSC methods with a specific modification and analysis.

The paper tackles the problem of subspace clustering by proposing a modified thresholding-based subspace clustering (TSC) algorithm that uses a data-driven method to select the number of nearest neighbors, and it provides a direct performance analysis of the clustering error for both the modified and original TSC algorithms.

Subspace clustering refers to the problem of clustering high-dimensional data points into a union of low-dimensional linear subspaces, where the number of subspaces, their dimensions and orientations are all unknown. In this paper, we propose a variation of the recently introduced thresholding-based subspace clustering (TSC) algorithm, which applies spectral clustering to an adjacency matrix constructed from the nearest neighbors of each data point with respect to the spherical distance measure. The new element resides in an individual and data-driven choice of the number of nearest neighbors. Previous performance results for TSC, as well as for other subspace clustering algorithms based on spectral clustering, come in terms of an intermediate performance measure, which does not address the clustering error directly. Our main analytical contribution is a performance analysis of the modified TSC algorithm (as well as the original TSC algorithm) in terms of the clustering error directly.

View on arXiv PDF

Similar