Eigenvalue Corrected Noisy Natural Gradient
This work addresses a specific limitation of interest to Bayesian deep learning researchers, but its contribution is incremental because it builds directly on noisy K-FAC.
The paper tackles the problem of inaccurate diagonal variance in matrix-variate Gaussian posteriors for variational Bayesian neural networks by proposing an eigenvalue corrected extension to noisy K-FAC, which consistently outperforms existing algorithms on regression and classification tasks.
Variational Bayesian neural networks combine the flexibility of deep learning with Bayesian uncertainty estimation. However, inference procedures for flexible variational posteriors are computationally expensive. A recently proposed method, noisy natural gradient, offers a surprisingly simple way to fit expressive posteriors by adding weight noise to regular natural gradient updates. Noisy K-FAC is an instance of noisy natural gradient that fits a matrix-variate Gaussian posterior with minor changes to ordinary K-FAC. Nevertheless, a matrix-variate Gaussian posterior does not capture an accurate diagonal variance. In this work, we extend noisy K-FAC to obtain a more flexible posterior distribution called the eigenvalue corrected matrix-variate Gaussian. The proposed method computes the full diagonal re-scaling factor in the Kronecker-factored eigenbasis. Empirically, our approach consistently outperforms existing algorithms (e.g., noisy K-FAC) on regression and classification tasks.
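To make the re-scaling idea concrete, the following is a minimal NumPy sketch for a single fully connected layer. It is not the authors' implementation. It assumes the per-example weight gradient is the outer product of a back-propagated gradient `g` and a layer input `a`, and it compares the Kronecker-product scaling that K-FAC implies in the eigenbasis with a full diagonal scaling estimated as the second moment of the projected gradients. The function names, the synthetic data, the `damping` parameter, and the inverse-damped variance used in `sample_weights` are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def kfe_scalings(acts, grads):
    """Return the Kronecker-factored eigenbasis and two diagonal scalings.

    acts:  (N, d_in)  layer inputs a_n
    grads: (N, d_out) back-propagated pre-activation gradients g_n
    The per-example weight gradient is the outer product g_n a_n^T.
    """
    n = acts.shape[0]
    A = acts.T @ acts / n      # input second moment, (d_in, d_in)
    S = grads.T @ grads / n    # gradient second moment, (d_out, d_out)
    lam_A, U_A = np.linalg.eigh(A)
    lam_S, U_S = np.linalg.eigh(S)

    # K-FAC: the scaling in the eigenbasis is constrained to be a
    # Kronecker product of the two factors' eigenvalues.
    kfac_scale = np.outer(lam_S, lam_A)

    # Eigenvalue correction: second moment of each projected gradient
    # entry, (U_S^T g a^T U_A)_{ij}^2 = (U_S^T g)_i^2 * (U_A^T a)_j^2.
    proj_g = grads @ U_S
    proj_a = acts @ U_A
    corrected_scale = (proj_g ** 2).T @ (proj_a ** 2) / n
    return U_A, U_S, kfac_scale, corrected_scale


def sample_weights(mean, U_A, U_S, scale, damping, rng):
    """Draw weights whose variance along each eigen-direction is
    1 / (scale + damping). This form is an illustrative assumption."""
    std = 1.0 / np.sqrt(scale + damping)
    eps = rng.standard_normal(mean.shape)
    # Noise is diagonal in the eigenbasis, then rotated back.
    return mean + U_S @ (std * eps) @ U_A.T


def fisher_in_kfe(acts, grads, U_A, U_S):
    """Exact empirical Fisher of vec(g a^T), rotated into the eigenbasis."""
    n, d_in = acts.shape
    d_out = grads.shape[1]
    vecs = (grads[:, :, None] * acts[:, None, :]).reshape(n, d_out * d_in)
    F = vecs.T @ vecs / n
    Q = np.kron(U_S, U_A)
    return Q.T @ F @ Q


rng = np.random.default_rng(0)
acts = rng.standard_normal((2000, 4))
# Synthetic gradients that depend on the inputs, so the Kronecker
# factorization is only approximate.
grads = rng.standard_normal((2000, 3)) * (1.0 + acts[:, :1] ** 2)

U_A, U_S, kfac_scale, corrected_scale = kfe_scalings(acts, grads)
F_kfe = fisher_in_kfe(acts, grads, U_A, U_S)
err_kfac = np.linalg.norm(F_kfe - np.diag(kfac_scale.ravel()))
err_corrected = np.linalg.norm(F_kfe - np.diag(corrected_scale.ravel()))
print(f"Frobenius error  K-FAC: {err_kfac:.3f}  corrected: {err_corrected:.3f}")

W = sample_weights(np.zeros((3, 4)), U_A, U_S, corrected_scale, 1e-3, rng)
```

Because the corrected scaling equals the exact diagonal of the empirical Fisher in this basis, its Frobenius error can never exceed that of the Kronecker-product scaling. This is one way to read the claim that a plain matrix-variate Gaussian does not capture an accurate diagonal variance.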