DS LGMar 5, 2015

Scalable Iterative Algorithm for Robust Subspace Clustering

Sanghyuk Chun, Yung-Kyun Noh, Jinwoo Shin

arXiv:1503.01578v22.31 citations

Originality Incremental advance

AI Analysis

This work addresses the problem of slow robust subspace clustering for researchers and practitioners dealing with high-dimensional large-scale data, representing an incremental improvement over existing methods.

The paper tackles the computational inefficiency of robust subspace clustering methods by developing a faster iterative algorithm that uses a non-squared ℓ₂-norm objective and simplifies subspace updates, achieving an order of magnitude faster convergence and outperforming prior methods in accuracy and running time on the MNIST dataset.

Subspace clustering (SC) is a popular method for dimensionality reduction of high-dimensional data, where it generalizes Principal Component Analysis (PCA). Recently, several methods have been proposed to enhance the robustness of PCA and SC, while most of them are computationally very expensive, in particular, for high dimensional large-scale data. In this paper, we develop much faster iterative algorithms for SC, incorporating robustness using a {\em non-squared} $\ell_2$-norm objective. The known implementations for optimizing the objective would be costly due to the alternative optimization of two separate objectives: optimal cluster-membership assignment and robust subspace selection, while the substitution of one process to a faster surrogate can cause failure in convergence. To address the issue, we use a simplified procedure requiring efficient matrix-vector multiplications for subspace update instead of solving an expensive eigenvector problem at each iteration, in addition to release nested robust PCA loops. We prove that the proposed algorithm monotonically converges to a local minimum with approximation guarantees, e.g., it achieves 2-approximation for the robust PCA objective. In our experiments, the proposed algorithm is shown to converge at an order of magnitude faster than known algorithms optimizing the same objective, and have outperforms prior subspace clustering methods in accuracy and running time for MNIST dataset.

View on arXiv PDF

Similar