DS LG MLJan 3, 2014

More Algorithms for Provable Dictionary Learning

Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

arXiv:1401.0579v156 citations

Originality Incremental advance

AI Analysis

This addresses a key bottleneck in sparse coding for applications like neuroscience and image processing, though it is incremental as it builds on prior provable algorithms.

The paper tackles the problem of designing provable dictionary learning algorithms that allow sparsity significantly greater than √n in the hidden vector, achieving sparsity up to n/poly(log n) for a class of matrices with individually recoverable features.

In dictionary learning, also known as sparse coding, the algorithm is given samples of the form $y = Ax$ where $x\in \mathbb{R}^m$ is an unknown random sparse vector and $A$ is an unknown dictionary matrix in $\mathbb{R}^{n\times m}$ (usually $m > n$, which is the overcomplete case). The goal is to learn $A$ and $x$. This problem has been studied in neuroscience, machine learning, visions, and image processing. In practice it is solved by heuristic algorithms and provable algorithms seemed hard to find. Recently, provable algorithms were found that work if the unknown feature vector $x$ is $\sqrt{n}$-sparse or even sparser. Spielman et al. \cite{DBLP:journals/jmlr/SpielmanWW12} did this for dictionaries where $m=n$; Arora et al. \cite{AGM} gave an algorithm for overcomplete ($m >n$) and incoherent matrices $A$; and Agarwal et al. \cite{DBLP:journals/corr/AgarwalAN13} handled a similar case but with weaker guarantees. This raised the problem of designing provable algorithms that allow sparsity $\gg \sqrt{n}$ in the hidden vector $x$. The current paper designs algorithms that allow sparsity up to $n/poly(\log n)$. It works for a class of matrices where features are individually recoverable, a new notion identified in this paper that may motivate further work. The algorithm runs in quasipolynomial time because they use limited enumeration.

View on arXiv PDF

Similar