From Bayesian Sparsity to Gated Recurrent Nets
This work addresses sparse estimation problems in domains such as signal processing and geometry recovery, offering a learning-based upgrade to existing algorithms. It can appear incremental, however, because it builds on previously established connections between optimization algorithms and neural networks.
The paper tackles sparse estimation by showing that complex Bayesian algorithms for promoting sparsity mirror the structure of LSTM and other gated recurrent networks. This correspondence leads to a novel system that, given training data, efficiently estimates optimal solutions in challenging regimes such as direction-of-arrival (DOA) estimation and 3D geometry recovery, where other algorithms fail.
The iterations of many first-order algorithms, when applied to minimizing common regularized regression functions, often resemble neural network layers with pre-specified weights. This observation has prompted the development of learning-based approaches that purport to replace these iterations with enhanced surrogates forged as DNN models from available training data. For example, important NP-hard sparse estimation problems have recently benefitted from this genre of upgrade, with simple feedforward or recurrent networks ousting proximal gradient-based iterations. Analogously, this paper demonstrates that more powerful Bayesian algorithms for promoting sparsity, which rely on complex multi-loop majorization-minimization techniques, mirror the structure of more sophisticated long short-term memory (LSTM) networks, or alternative gated feedback networks previously designed for sequence prediction. As part of this development, we examine the parallels between latent variable trajectories operating across multiple time-scales during optimization, and the activations within deep network structures designed to adaptively model such characteristic sequences. The resulting insights lead to a novel sparse estimation system that, when granted training data, can estimate optimal solutions efficiently in regimes where other algorithms fail, including practical direction-of-arrival (DOA) and 3D geometry recovery problems. The underlying principles we expose are also suggestive of a learning process for a richer class of multi-loop algorithms in other domains.
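The opening observation, that first-order iterations resemble network layers with pre-specified weights, can be made concrete with a minimal sketch. The example below is an illustration and not the paper's method: it assumes the lasso objective 0.5*||Ax - y||^2 + lam*||x||_1 as the "common regularized regression function" and uses ISTA as the proximal gradient iteration. The function names (ista_step, ista_as_layer), the problem sizes, and the value of lam are hypothetical choices made only for this example.

```python
import numpy as np

def soft_threshold(z, theta):
    # Proximal operator of theta * ||.||_1; plays the role of the nonlinearity.
    return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

def ista_step(A, y, x, lam, L):
    # One proximal gradient (ISTA) step on 0.5*||Ax - y||^2 + lam*||x||_1.
    grad = A.T @ (A @ x - y)
    return soft_threshold(x - grad / L, lam / L)

def ista_as_layer(A, lam, L):
    # The same step rewritten as a recurrent layer x <- act(W x + B y),
    # where W, B and theta are fixed in advance by A, lam and L.
    n = A.shape[1]
    W = np.eye(n) - (A.T @ A) / L
    B = A.T / L
    theta = lam / L

    def layer(x, y):
        return soft_threshold(W @ x + B @ y, theta)

    return layer

def lasso_objective(A, y, x, lam):
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

# Hypothetical toy problem: 20 measurements, 50 unknowns, 3 nonzeros.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
y = A @ x_true

lam = 0.1                        # assumed regularization weight
L = np.linalg.norm(A, 2) ** 2    # Lipschitz constant of the smooth term

layer = ista_as_layer(A, lam, L)
x_iter = np.zeros(50)
x_net = np.zeros(50)
for _ in range(200):
    x_iter = ista_step(A, y, x_iter, lam, L)   # optimization view
    x_net = layer(x_net, y)                    # network view

max_gap = np.max(np.abs(x_iter - x_net))
print("largest difference between the two views:", max_gap)
```

In this sketch the two views agree up to floating-point error, since the layer is only an algebraic rearrangement of the iteration. A learning-based approach of the kind the abstract describes would treat quantities such as W, B and theta as trainable parameters rather than fixing them from A.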