LG, AI, CV | Aug 17, 2024

Learning to Explore for Stochastic Gradient MCMC

arXiv:2408.09140v1 | 3 citations | h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the computational expense of Bayesian inference for high-dimensional neural networks, offering an incremental improvement in sampling efficiency for practitioners in machine learning.

The paper tackles efficient exploration of multi-modal posterior distributions in Bayesian Neural Networks with Stochastic Gradient MCMC. It proposes a meta-learning strategy that improves sampling efficiency and outperforms vanilla SGMCMC on image classification benchmarks without significant computational overhead.

Bayesian Neural Networks (BNNs) with high-dimensional parameters pose a challenge for posterior inference due to the multi-modality of the posterior distributions. Stochastic Gradient MCMC (SGMCMC) with cyclical learning rate scheduling is a promising solution, but it requires a large number of sampling steps to explore high-dimensional multi-modal posteriors, making it computationally expensive. In this paper, we propose a meta-learning strategy to build SGMCMC that can efficiently explore multi-modal target distributions. Our algorithm allows the learned SGMCMC to quickly explore the high-density regions of the posterior landscape. We also show that this exploration property is transferable to various tasks, even ones unseen during meta-training. Using popular image classification benchmarks and a variety of downstream tasks, we demonstrate that our method significantly improves sampling efficiency, achieving better performance than vanilla SGMCMC without incurring significant computational overhead.
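For orientation, the vanilla baseline the abstract refers to (SGMCMC with a cyclical learning rate) can be sketched roughly as follows. This is a toy illustration and not the paper's meta-learned sampler: it runs stochastic gradient Langevin dynamics on a one-dimensional two-mode target, with the exact gradient standing in for a minibatch estimate. The function names, the cosine schedule, the Gaussian-mixture target, and all hyperparameters are assumptions chosen for illustration.

```python
import math
import random


def cyclical_step_size(k, total_steps, num_cycles, base_step):
    """Cosine cyclical schedule (assumed form): the step size restarts at
    base_step at the start of each cycle and decays toward zero."""
    cycle_len = math.ceil(total_steps / num_cycles)
    pos = (k % cycle_len) / cycle_len  # position in the current cycle, in [0, 1)
    return 0.5 * base_step * (math.cos(math.pi * pos) + 1.0)


def grad_neg_log_density(theta):
    """Gradient of U(theta) = -log p(theta) for a toy target: an equal
    mixture of N(-2, 0.5^2) and N(2, 0.5^2)."""
    var = 0.25
    a = -((theta + 2.0) ** 2) / (2.0 * var)
    b = -((theta - 2.0) ** 2) / (2.0 * var)
    m = max(a, b)  # subtract the max exponent for numerical stability
    w_left, w_right = math.exp(a - m), math.exp(b - m)
    return (w_left * (theta + 2.0) + w_right * (theta - 2.0)) / (var * (w_left + w_right))


def cyclical_sgld(num_steps=2000, num_cycles=4, base_step=0.05, seed=0):
    """Langevin update: theta <- theta - eps * grad U(theta) + sqrt(2 * eps) * noise.
    Large steps early in each cycle favor exploration; small steps late in
    the cycle favor sampling around the current mode."""
    rng = random.Random(seed)
    theta = 0.0
    samples = []
    for k in range(num_steps):
        eps = cyclical_step_size(k, num_steps, num_cycles, base_step)
        noise = rng.gauss(0.0, 1.0)
        theta = theta - eps * grad_neg_log_density(theta) + math.sqrt(2.0 * eps) * noise
        samples.append(theta)
    return samples


samples = cyclical_sgld()
```

Even in this tiny example, how often the chain crosses between the two modes depends on the step-size schedule and the number of cycles. That cost of exploration is what the abstract describes as expensive in high dimensions.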

Code Implementations: 1 repo
