Finding a sparse vector in a subspace: Linear sparsity using alternating directions
The paper addresses a fundamental bottleneck in sparse recovery for signal processing and machine learning, with applications in sparse dictionary learning and sparse PCA. Its contribution is incremental in that it builds on existing planted sparse models.
The paper tackles the problem of finding the sparsest vector in a generic subspace, a homogeneous variant of sparse recovery. It shows that a nonconvex alternating directions method provably succeeds even when the fraction of nonzero entries in the target vector is constant (Ω(1), i.e., sparsity linear in the ambient dimension), whereas simple convex heuristics provably break down once that fraction substantially exceeds O(1/√n).
Is it possible to find the sparsest vector (direction) in a generic subspace $\mathcal{S} \subseteq \mathbb{R}^p$ with $\mathrm{dim}(\mathcal{S})= n < p$? This problem can be considered a homogeneous variant of the sparse recovery problem, and finds connections to sparse dictionary learning, sparse PCA, and many other problems in signal processing and machine learning. In this paper, we focus on a **planted sparse model** for the subspace: the target sparse vector is embedded in an otherwise random subspace. Simple convex heuristics for this planted recovery problem provably break down when the fraction of nonzero entries in the target sparse vector substantially exceeds $O(1/\sqrt{n})$. In contrast, we exhibit a relatively simple nonconvex approach based on alternating directions, which provably succeeds even when the fraction of nonzero entries is $\Omega(1)$. To the best of our knowledge, this is the first practical algorithm to achieve linear scaling under the planted sparse model. Empirically, our proposed algorithm also succeeds in more challenging data models, e.g., sparse dictionary learning.
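As a rough illustration of what an alternating directions scheme for this problem can look like, the sketch below assumes the subspace is represented by a matrix `Y` with orthonormal columns and alternates between soft-thresholding `Y @ q` and renormalizing `q` on the unit sphere. The penalized objective, the choice `lam = 1/sqrt(p)`, the initialization from a row of `Y`, and all function names are assumptions made for illustration; none of them is stated in the abstract above, and the sketch makes no claim about recovery guarantees.

```python
import numpy as np


def soft_threshold(z, lam):
    """Entrywise soft-thresholding: shrink each entry toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def planted_sparse_basis(p, n, theta, rng):
    """Hypothetical planted model: one sparse vector plus n-1 Gaussian vectors.

    Returns an orthonormal basis Y (p x n) of their span and the unit-norm
    sparse vector x0 that was planted in it.
    """
    support = rng.random(p) < theta          # Bernoulli(theta) support
    x0 = support.astype(float)
    if not support.any():                    # guard against an empty support
        x0[0] = 1.0
    x0 /= np.linalg.norm(x0)
    G = rng.standard_normal((p, n - 1)) / np.sqrt(p)
    Y, _ = np.linalg.qr(np.column_stack([x0, G]))  # reduced QR: Y is p x n
    return Y, x0


def alternating_directions(Y, q0, lam, iters=200):
    """Alternate an x-step (soft-threshold) and a q-step (project to sphere).

    Assumed objective: 0.5 * ||Y q - x||^2 + lam * ||x||_1 with ||q||_2 = 1.
    """
    q = q0 / np.linalg.norm(q0)
    for _ in range(iters):
        x = soft_threshold(Y @ q, lam)       # minimize over x with q fixed
        v = Y.T @ x                          # minimize over unit q with x fixed
        norm_v = np.linalg.norm(v)
        if norm_v == 0.0:                    # everything thresholded away
            break
        q = v / norm_v
    return q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n, theta = 2000, 10, 0.1
    Y, x0 = planted_sparse_basis(p, n, theta, rng)
    q = alternating_directions(Y, Y[0], lam=1.0 / np.sqrt(p))
    x_hat = Y @ q                            # candidate sparse vector in the subspace
    print("correlation with planted vector:", abs(x_hat @ x0))
```

The candidate `Y @ q` always lies in the subspace by construction; whether it aligns with the planted vector depends on the initialization and on `lam`, so the printed correlation is a diagnostic rather than a guaranteed outcome.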