Noisy Subspace Clustering via Thresholding
This work addresses subspace clustering of noisy data, a common setting in data analysis, but it appears incremental because it extends the analysis of the TSC algorithm from [1] to the noisy case rather than introducing a new method.
The paper tackles the problem of clustering noisy high-dimensional data into a union of unknown low-dimensional subspaces and a set of outliers. It shows that the thresholding-based subspace clustering (TSC) algorithm succeeds even when the subspaces intersect, with an explicit tradeoff between the allowed noise level and the affinity of the subspaces.
We consider the problem of clustering noisy high-dimensional data points into a union of low-dimensional subspaces and a set of outliers. The number of subspaces, their dimensions, and their orientations are unknown. A probabilistic performance analysis of the thresholding-based subspace clustering (TSC) algorithm introduced recently in [1] shows that TSC succeeds in the noisy case, even when the subspaces intersect. Our results reveal an explicit tradeoff between the allowed noise level and the affinity of the subspaces. We furthermore find that the simple outlier detection scheme introduced in [1] provably succeeds in the noisy case.
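The abstract does not spell out the steps of TSC or of the outlier detection scheme, so the following is only a minimal sketch under stated assumptions. It assumes that TSC normalizes the data points, keeps for each point the q others with the largest absolute inner product, and builds a symmetric graph from those neighbors, and that a point is declared an outlier when its largest absolute correlation with any other point falls below c * sqrt(log N) / sqrt(m). The function names, the edge weight, the constant c = sqrt(6), and the toy data generator (two orthogonal coordinate subspaces, the easy case) are all illustrative assumptions, not taken from the text. The final clustering of the graph, for example by spectral clustering, is omitted.

```python
import numpy as np


def normalize_columns(X):
    """Scale every column (data point) of X to unit Euclidean norm."""
    return X / np.linalg.norm(X, axis=0, keepdims=True)


def abs_correlations(X):
    """Absolute inner products between normalized points, with a zero diagonal."""
    Xn = normalize_columns(X)
    C = np.abs(Xn.T @ Xn)
    np.fill_diagonal(C, 0.0)  # ignore each point's correlation with itself
    return C


def detect_outliers(X, c=np.sqrt(6.0)):
    """Flag points whose largest correlation with any other point is small.

    The threshold c * sqrt(log N) / sqrt(m) is an assumed form.
    """
    m, N = X.shape
    return abs_correlations(X).max(axis=1) < c * np.sqrt(np.log(N) / m)


def tsc_adjacency(X, q):
    """Symmetric graph linking each point to its q most correlated points."""
    C = abs_correlations(X)
    Z = np.zeros_like(C)
    for j in range(C.shape[0]):
        nearest = np.argsort(C[:, j])[-q:]  # indices of the q largest entries
        # Assumed edge weight: decreasing in the angle between the two points.
        angles = np.arccos(np.clip(C[nearest, j], 0.0, 1.0))
        Z[nearest, j] = np.exp(-2.0 * angles)
    return Z + Z.T


def sample_toy_data(m=200, d=3, n_per=30, n_out=5, sigma=0.01, seed=0):
    """Noisy points on two orthogonal d-dimensional subspaces plus outliers."""
    rng = np.random.default_rng(seed)
    blocks, labels = [], []
    for k in range(2):
        U = np.zeros((m, d))
        U[k * d:(k + 1) * d, :] = np.eye(d)  # coordinate subspace number k
        blocks.append(U @ rng.standard_normal((d, n_per)))
        labels += [k] * n_per
    inliers = normalize_columns(np.hstack(blocks))
    inliers = inliers + sigma * rng.standard_normal(inliers.shape)
    outliers = rng.standard_normal((m, n_out))  # random directions in R^m
    return np.hstack([inliers, outliers]), np.array(labels + [-1] * n_out)


X, labels = sample_toy_data()
is_outlier = detect_outliers(X)
A = tsc_adjacency(X[:, ~is_outlier], q=3)
```

In this sketch the outliers are removed before the graph is built, so `A` only links the remaining points. A graph clustering step applied to `A` would then produce the cluster labels.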