Scaling Hamiltonian Monte Carlo Inference for Bayesian Neural Networks with Symmetric Splitting
This work addresses efficient Bayesian inference for large-scale machine learning and makes HMC a more feasible option, although the contribution appears incremental because it builds on existing split HMC methods.
The paper tackles the challenge of scaling Hamiltonian Monte Carlo (HMC) for Bayesian neural networks in large-data regimes by introducing a symmetric integration scheme that avoids stochastic gradients. This enables full HMC over entire datasets on a single GPU and yields better accuracy and uncertainty quantification than stochastic gradient MCMC.
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) approach that exhibits favourable exploration properties in high-dimensional models such as neural networks. Unfortunately, HMC has limited use in large-data regimes and little work has explored suitable approaches that aim to preserve the entire Hamiltonian. In our work, we introduce a new symmetric integration scheme for split HMC that does not rely on stochastic gradients. We show that our new formulation is more efficient than previous approaches and is easy to implement with a single GPU. As a result, we are able to perform full HMC over common deep learning architectures using entire data sets. In addition, when we compare with stochastic gradient MCMC, we show that our method achieves better performance in both accuracy and uncertainty quantification. Our approach demonstrates HMC as a feasible option when considering inference schemes for large-scale machine learning problems.
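To make the idea of a symmetric integration scheme over data splits concrete, the following is a minimal sketch and not the authors' implementation. It assumes the potential energy decomposes as U(theta) = U_1(theta) + ... + U_M(theta) over M data chunks, a unit mass matrix, and a palindromic ordering of updates: half momentum updates for chunks 1 to M, one full position update, then half momentum updates for chunks M to 1. The toy linear-Gaussian model and the names make_chunk_grads, symmetric_split_step, and trajectory are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_chunk_grads(X, y, n_chunks, prior_var=1.0):
    """Build one gradient function per data chunk for a toy linear-Gaussian model.

    Assumption for illustration: U(theta) = sum_m U_m(theta), where U_m holds
    chunk m's negative log-likelihood plus a 1/n_chunks share of a Gaussian prior.
    """
    grads = []
    for Xm, ym in zip(np.array_split(X, n_chunks), np.array_split(y, n_chunks)):
        # Bind the chunk via default arguments so each closure keeps its own data.
        def grad_m(theta, Xm=Xm, ym=ym):
            return Xm.T @ (Xm @ theta - ym) + theta / (prior_var * n_chunks)
        grads.append(grad_m)
    return grads


def symmetric_split_step(theta, p, chunk_grads, eps):
    """One palindromic step: half kicks 1..M, full drift, half kicks M..1."""
    for g in chunk_grads:
        p = p - 0.5 * eps * g(theta)   # half momentum update from one chunk
    theta = theta + eps * p            # full position update (unit mass matrix)
    for g in reversed(chunk_grads):
        p = p - 0.5 * eps * g(theta)   # mirror-image half updates
    return theta, p


def trajectory(theta, p, chunk_grads, eps, n_steps):
    """Apply the symmetric step n_steps times; no stochastic gradients are used."""
    for _ in range(n_steps):
        theta, p = symmetric_split_step(theta, p, chunk_grads, eps)
    return theta, p


# Usage on synthetic data: only one chunk's gradient is needed at a time,
# yet every chunk contributes to every step.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = rng.normal(size=40)
chunk_grads = make_chunk_grads(X, y, n_chunks=4)
theta0 = rng.normal(size=3)
p0 = rng.normal(size=3)
theta1, p1 = trajectory(theta0, p0, chunk_grads, eps=0.01, n_steps=25)
```

Because the sequence of updates reads the same forwards and backwards, the map is time-reversible: running the trajectory again from (theta1, -p1) returns to the starting position, which is the property a Metropolis correction in HMC relies on. A full sampler would also draw fresh momenta and apply an accept/reject step, both omitted here.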