Replica-exchange Nosé-Hoover dynamics for Bayesian learning on large datasets
This method addresses the challenge of Bayesian inference for large-scale machine learning applications, though it appears incremental as it builds on existing replica-exchange and Nosé-Hoover techniques.
The paper tackles the problem of efficiently sampling from complex multimodal posterior distributions in Bayesian learning on large datasets, achieving significant improvements over strong baselines in experiments with deep Bayesian neural networks.
In this paper, we present a new practical method for Bayesian learning that can rapidly draw representative samples from complex posterior distributions with multiple isolated modes in the presence of mini-batch noise. This is achieved by simulating a collection of replicas in parallel at different temperatures and periodically swapping them. The replicas' states evolve under Nosé-Hoover dynamics, which adaptively neutralizes the mini-batch noise. To perform proper exchanges, we develop a new protocol with a noise-aware acceptance test, by which detailed balance is preserved asymptotically. Tests on synthetic distributions illustrate its efficacy on complex multimodal posteriors, and experiments with deep Bayesian neural networks on large-scale datasets show significant improvements over strong baselines.
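The overall loop described in the abstract (parallel replicas at different temperatures, thermostatted updates under noisy gradients, periodic swaps) can be illustrated with a minimal sketch. It is not the paper's algorithm: the swap step uses the standard replica-exchange Metropolis test on exact energies rather than the noise-aware test, whose form is not given here. The 1D double-well potential, the Gaussian noise standing in for mini-batch noise, the Euler discretization, the choice to swap positions only, and all names and parameter values (`potential`, `nose_hoover_step`, `swap_accept_prob`, `run`, `h`, `noise_std`, `temps`) are assumptions made for illustration.

```python
import math
import random


def potential(x):
    """Toy double-well energy with isolated modes near x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def noisy_grad(x, rng, noise_std):
    """Exact gradient plus Gaussian noise standing in for mini-batch noise."""
    return 4.0 * x * (x * x - 1.0) + rng.gauss(0.0, noise_std)


def nose_hoover_step(x, p, xi, temp, h, rng, noise_std):
    """One Euler step of a Nose-Hoover thermostat at temperature `temp`."""
    p += (-noisy_grad(x, rng, noise_std) - xi * p) * h
    x += p * h
    # The thermostat variable grows when kinetic energy exceeds the target,
    # which increases friction, and shrinks when it falls below the target.
    xi += (p * p - temp) * h
    return x, p, xi


def swap_accept_prob(u_i, u_j, t_i, t_j):
    """Standard Metropolis probability for swapping positions of two replicas.

    Assumes exact energies; this is NOT the noise-aware test from the paper.
    """
    log_ratio = (1.0 / t_i - 1.0 / t_j) * (u_i - u_j)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def run(n_steps=2000, swap_every=50, temps=(0.1, 0.3, 1.0), h=0.01,
        noise_std=1.0, seed=0):
    """Simulate all replicas and return the positions of the coldest one."""
    rng = random.Random(seed)
    xs = [1.0 for _ in temps]   # positions, one per temperature
    ps = [0.0 for _ in temps]   # momenta stay attached to their temperature
    xis = [0.0 for _ in temps]  # thermostat variables
    samples = []
    for step in range(1, n_steps + 1):
        for k, temp in enumerate(temps):
            xs[k], ps[k], xis[k] = nose_hoover_step(
                xs[k], ps[k], xis[k], temp, h, rng, noise_std)
        if step % swap_every == 0:
            # Propose exchanging positions between a random adjacent pair.
            k = rng.randrange(len(temps) - 1)
            prob = swap_accept_prob(potential(xs[k]), potential(xs[k + 1]),
                                    temps[k], temps[k + 1])
            if rng.random() < prob:
                xs[k], xs[k + 1] = xs[k + 1], xs[k]
        samples.append(xs[0])
    return samples
```

Calling `run()` returns the trajectory of the lowest-temperature replica. In this sketch the hotter replicas cross the barrier between the two wells more easily, and accepted swaps are what can carry those crossings down to the cold replica.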