LG AI ML | Nov 11, 2018

SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient

Aaron Mishkin, Frederik Kunstner, Didrik Nielsen, Mark Schmidt, Mohammad Emtiyaz Khan

arXiv:1811.04504v218.367 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the problem of poor uncertainty estimates in Bayesian deep learning, offering practitioners a more efficient alternative to existing approximations. It is an incremental advance in that it builds on existing natural-gradient and low-rank methods.

The paper tackles the computational challenge of uncertainty estimation in large deep-learning models by proposing SLANG, a stochastic, low-rank, approximate natural-gradient method for variational inference, which enables faster and more accurate uncertainty estimation than mean-field methods and performs comparably to state-of-the-art methods on standard benchmarks.

Uncertainty estimation in large deep-learning models is a computationally challenging task, where it is difficult to form even a Gaussian approximation to the posterior distribution. In such situations, existing methods usually resort to a diagonal approximation of the covariance matrix, despite the fact that these matrices are known to result in poor uncertainty estimates. To address this issue, we propose a new stochastic, low-rank, approximate natural-gradient (SLANG) method for variational inference in large, deep models. Our method estimates a "diagonal plus low-rank" structure based solely on back-propagated gradients of the network log-likelihood. This requires strictly fewer gradient computations than methods that compute the gradient of the whole variational objective. Empirical evaluations on standard benchmarks confirm that SLANG enables faster and more accurate estimation of uncertainty than mean-field methods, and performs comparably to state-of-the-art methods.
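To illustrate why a "diagonal plus low-rank" structure stays tractable at scale, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a d x d matrix of the form D + U Uᵀ, with D diagonal and U of shape d x L where L is much smaller than d, and solves a linear system with it through the Woodbury identity so that the full matrix is never formed. The function name `woodbury_solve`, the sizes, and the random inputs are all hypothetical choices made for the example.

```python
import numpy as np

def woodbury_solve(diag, U, b):
    """Solve (diag(diag) + U @ U.T) x = b without building the d x d matrix.

    Uses the Woodbury identity:
    (D + U U^T)^{-1} = D^{-1} - D^{-1} U (I + U^T D^{-1} U)^{-1} U^T D^{-1}
    """
    d_inv_b = b / diag                    # D^{-1} b, costs O(d)
    d_inv_U = U / diag[:, None]           # D^{-1} U, costs O(d L)
    # Only an L x L system has to be solved densely.
    small = np.eye(U.shape[1]) + U.T @ d_inv_U
    correction = d_inv_U @ np.linalg.solve(small, U.T @ d_inv_b)
    return d_inv_b - correction

# Hypothetical sizes: d parameters, rank-L correction.
rng = np.random.default_rng(0)
d, L = 200, 5
diag = rng.uniform(0.5, 2.0, size=d)      # positive diagonal entries
U = rng.normal(size=(d, L))               # low-rank factor
b = rng.normal(size=d)

x_fast = woodbury_solve(diag, U, b)

# Dense reference solve, feasible only because d is small here.
x_dense = np.linalg.solve(np.diag(diag) + U @ U.T, b)
max_abs_error = np.max(np.abs(x_fast - x_dense))
```

The structured solve needs O(dL) storage and O(dL²) time, whereas a dense d x d matrix needs O(d²) storage. That gap is what makes a richer-than-diagonal approximation feasible for large models.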
