Local stability and robustness of sparse dictionary learning in the presence of noise
This work provides theoretical stability guarantees for sparse coding methods used in signal processing and machine learning. It addresses a known bottleneck, but the contribution is incremental, since it extends prior analyses that were limited to noiseless settings and/or under-complete dictionaries.
The paper tackles the lack of theoretical guarantees for sparse dictionary learning. It proves that, under a probabilistic model of sparse signals, the non-convex optimization problem has, with high probability, a local minimum near the true dictionary, even for over-complete dictionaries and noisy signals. The bounds are non-asymptotic and show how the noise level and the coherence can scale with the problem dimensions.
A popular approach within the signal processing and machine learning communities consists in modelling signals as sparse linear combinations of atoms selected from a learned dictionary. While this paradigm has led to numerous empirical successes in various fields ranging from image to audio processing, there have only been a few theoretical arguments supporting this empirical evidence. In particular, sparse coding, or sparse dictionary learning, relies on a non-convex procedure whose local minima have not been fully analyzed yet. In this paper, we consider a probabilistic model of sparse signals, and show that, with high probability, sparse coding admits a local minimum around the reference dictionary generating the signals. Our study takes into account the case of over-complete dictionaries and noisy signals, thus extending previous work limited to noiseless settings and/or under-complete dictionaries. The analysis we conduct is non-asymptotic and makes it possible to understand how the key quantities of the problem, such as the coherence or the level of noise, can scale with respect to the dimension of the signals, the number of atoms, the sparsity and the number of observations.
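For concreteness, the setting described above can be sketched in a common formulation of sparse coding. The abstract does not fix any notation, so every symbol here is an assumption: $m$ is the signal dimension, $p$ the number of atoms, $k$ the sparsity, $n$ the number of observations, $D_0$ the reference dictionary, $\varepsilon_i$ the noise, and $\lambda$ a regularization weight. The $\ell_1$ penalty is one standard choice and may differ from the exact objective analyzed in the paper.

```latex
% Illustrative notation only; none of these symbols are fixed by the abstract.
\begin{align*}
  % Assumed generative model: k-sparse codes on the reference dictionary, plus noise
  x_i &= D_0 \alpha_i + \varepsilon_i, \qquad \|\alpha_i\|_0 \le k, \qquad i = 1, \dots, n \\
  % Assumed l1-penalized sparse coding objective, non-convex in D
  F_X(D) &= \frac{1}{n} \sum_{i=1}^{n} \min_{\alpha \in \mathbb{R}^{p}}
            \left[ \tfrac{1}{2} \| x_i - D\alpha \|_2^2 + \lambda \|\alpha\|_1 \right] \\
  % Dictionary learning over unit-norm atoms d_j (the columns of D)
  \min_{D \in \mathbb{R}^{m \times p}} \; & F_X(D)
            \quad \text{s.t.} \quad \|d_j\|_2 = 1, \quad j = 1, \dots, p \\
  % Coherence: largest absolute correlation between two distinct atoms
  \mu(D) &= \max_{j \neq l} \left| \langle d_j, d_l \rangle \right|
\end{align*}
```

In this notation, the over-complete case corresponds to $p > m$, and the result stated above says that, with high probability, $F_X$ has a local minimum in a neighborhood of $D_0$.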