ITCVLGMLMay 4, 2015

Self-Expressive Decompositions for Matrix Approximation and Clustering

arXiv:1505.00824v118 citations
Originality Incremental advance
AI Analysis

This work addresses data-aware dimensionality reduction and clustering for researchers and practitioners, offering an incremental improvement over existing self-expressive methods.

The paper tackles the problem of dimensionality reduction and matrix decomposition by introducing SEED, a scalable greedy method that selects incoherent vectors from the dataset to form a basis and compute sparse representations, showing it can exactly represent low-rank matrices and vectors from unions of independent subspaces, with results indicating it is a low-complexity alternative to methods like sparse PCA.

Data-aware methods for dimensionality reduction and matrix decomposition aim to find low-dimensional structure in a collection of data. Classical approaches discover such structure by learning a basis that can efficiently express the collection. Recently, "self expression", the idea of using a small subset of data vectors to represent the full collection, has been developed as an alternative to learning. Here, we introduce a scalable method for computing sparse SElf-Expressive Decompositions (SEED). SEED is a greedy method that constructs a basis by sequentially selecting incoherent vectors from the dataset. After forming a basis from a subset of vectors in the dataset, SEED then computes a sparse representation of the dataset with respect to this basis. We develop sufficient conditions under which SEED exactly represents low rank matrices and vectors sampled from a unions of independent subspaces. We show how SEED can be used in applications ranging from matrix approximation and denoising to clustering, and apply it to numerous real-world datasets. Our results demonstrate that SEED is an attractive low-complexity alternative to other sparse matrix factorization approaches such as sparse PCA and self-expressive methods for clustering.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes