A note on the sample complexity of the Er-SpUD algorithm by Spielman, Wang and Wright for exact recovery of sparsely used dictionaries
This provides an optimal sample complexity bound for dictionary recovery in sparse coding, which is incremental as it refines prior theoretical results.
The paper tackles the problem of exactly recovering an invertible matrix A and a sparse random matrix X from observations Y=AX, showing that a version of the Er-SpUD algorithm achieves exact recovery with high probability when p ≥ Cn log n, which is optimal up to the constant C.
We consider the problem of recovering an invertible $n \times n$ matrix $A$ and a sparse $n \times p$ random matrix $X$ based on the observation of $Y = AX$ (up to a scaling and permutation of columns of $A$ and rows of $X$). Using only elementary tools from the theory of empirical processes we show that a version of the Er-SpUD algorithm by Spielman, Wang and Wright with high probability recovers $A$ and $X$ exactly, provided that $p \ge Cn\log n$, which is optimal up to the constant $C$.