ML · LG · ST · Nov 15, 2020

Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee

arXiv:2011.07439v1 · 53 citations
AI Analysis

This paper addresses two problems practitioners face with deep neural networks, high storage consumption and recovery of sparse structure, with an approach that is both theoretically supported and computationally feasible.

The paper tackles the lack of theoretical guarantees in sparse deep learning by proposing a fully Bayesian method with spike-and-slab priors. It achieves computational efficiency through variational inference and establishes a variational posterior contraction rate that justifies the method's consistency.

Sparse deep learning aims to address the huge storage consumption of deep neural networks and to recover the sparse structure of target functions. Although it has achieved tremendous empirical success, most sparse deep learning algorithms lack theoretical support. On the other hand, another line of work has proposed theoretical frameworks that are computationally infeasible. In this paper, we train sparse deep neural networks with a fully Bayesian treatment under spike-and-slab priors, and develop a set of computationally efficient variational inference procedures via continuous relaxation of the Bernoulli distribution. We provide the variational posterior contraction rate, which justifies the consistency of the proposed variational Bayes method. Notably, our empirical results demonstrate that this variational procedure provides uncertainty quantification in terms of the Bayesian predictive distribution and is also capable of accomplishing consistent variable selection by training a sparse multi-layer neural network.
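The abstract's "continuous relaxation of Bernoulli distribution" can be illustrated with a small sketch. The code below assumes the relaxation is of the binary Concrete (Gumbel-softmax) type, which is one common choice and may differ in detail from the paper's. It also assumes a Gaussian slab, so that each weight is a soft inclusion gate times a Gaussian draw. The function names `relaxed_bernoulli` and `sample_sparse_weight`, the `temperature` parameter, and the example values are all hypothetical and are not taken from the paper or its repository.

```python
import math
import random


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relaxed_bernoulli(phi, temperature, rng):
    """Draw a value in (0, 1) that relaxes a Bernoulli(phi) sample.

    Assumed form (binary Concrete): add logistic noise to logit(phi),
    divide by the temperature, and squash with a sigmoid. As the
    temperature approaches 0 the draw approaches a hard 0/1 sample.
    """
    eps = 1e-6
    phi = min(max(phi, eps), 1.0 - eps)       # keep the logs finite
    u = min(max(rng.random(), eps), 1.0 - eps)
    logit = (math.log(phi) - math.log1p(-phi)
             + math.log(u) - math.log1p(-u))
    return _sigmoid(logit / temperature)


def sample_sparse_weight(mu, sigma, phi, temperature, rng):
    """Sample one weight from a relaxed spike-and-slab variational family.

    Assumption: the slab is Gaussian N(mu, sigma^2) and the spike is a
    point mass at zero, approximated here by a gate close to 0.
    """
    gate = relaxed_bernoulli(phi, temperature, rng)
    slab = mu + sigma * rng.gauss(0.0, 1.0)   # reparameterized Gaussian draw
    return gate * slab


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical variational parameters for three weights:
    # (mean, std, inclusion probability)
    params = [(1.5, 0.1, 0.99), (-0.8, 0.1, 0.50), (0.3, 0.1, 0.01)]
    for mu, sigma, phi in params:
        w = sample_sparse_weight(mu, sigma, phi, temperature=0.5, rng=rng)
        print(f"phi={phi:.2f} -> sampled weight {w:+.3f}")
```

Because the gate is a smooth function of `phi`, gradients can flow to the inclusion probabilities during training, which is what makes this kind of relaxation attractive for variational inference over discrete inclusion indicators.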

Code Implementations: 1 repo
