LG Jan 29

Investigating Batch Inference in a Sequential Monte Carlo Framework for Neural Networks

Andrew Millard, Joshua Murphy, Peter Green, Simon Maskell

arXiv:2601.21983v11.4h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses the computational bottleneck in Bayesian inference for neural networks, offering a more efficient method for researchers and practitioners, though it is incremental as it builds on existing SMC techniques.

The paper tackles the computational expense of Sequential Monte Carlo (SMC) samplers for Bayesian neural networks by exploring data annealing methods that gradually introduce mini-batches, achieving up to 6× faster training with minimal accuracy loss on benchmark image classification tasks.

Bayesian inference allows us to define a posterior distribution over the weights of a generic neural network (NN). Exact posteriors are usually intractable, in which case approximations can be employed. One such approximation, variational inference, is computationally efficient when using mini-batch stochastic gradient descent, as subsets of the data are used for likelihood and gradient evaluations, though the approach relies on the selection of a variational distribution which sufficiently matches the form of the posterior. Particle-based methods such as Markov chain Monte Carlo and Sequential Monte Carlo (SMC) do not assume a parametric family for the posterior but typically require higher computational cost. These sampling methods typically use the full batch of data for likelihood and gradient evaluations, which contributes to this computational expense. We explore several methods of gradually introducing more mini-batches of data (data annealing) into likelihood and gradient evaluations of an SMC sampler. We find that we can achieve up to $6\times$ faster training with minimal loss in accuracy on benchmark image classification problems using NNs.
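To make the idea of data annealing concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a scalar Gaussian-mean model in place of a neural network, a random-walk Metropolis move in place of the gradient-based evaluations the abstract mentions, and one simple schedule that adds a single mini-batch per SMC iteration. All names (`data_annealed_smc`, `log_lik`, `log_prior`, `normalise`) are hypothetical and chosen only for illustration.

```python
import math
import random


def log_lik(theta, batch):
    """Unit-variance Gaussian log-likelihood of a batch (up to a constant)."""
    return -0.5 * sum((x - theta) ** 2 for x in batch)


def log_prior(theta, prior_sd=10.0):
    """Zero-mean Gaussian log-prior (up to a constant). Assumed for the toy model."""
    return -0.5 * (theta / prior_sd) ** 2


def normalise(log_w):
    """Convert log-weights to normalised weights, avoiding underflow."""
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def data_annealed_smc(data, n_particles=500, batch_size=20, step=0.2, seed=0):
    """Toy SMC sampler whose target gains one mini-batch per iteration."""
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 10.0) for _ in range(n_particles)]  # prior draws
    log_w = [0.0] * n_particles
    seen = []  # data introduced so far
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        seen.extend(batch)
        # Reweight: only the newly introduced batch enters the weight update.
        log_w = [lw + log_lik(t, batch) for lw, t in zip(log_w, theta)]
        w = normalise(log_w)
        ess = 1.0 / sum(wi * wi for wi in w)
        # Resample when the effective sample size drops below half.
        if ess < n_particles / 2:
            theta = rng.choices(theta, weights=w, k=n_particles)
            log_w = [0.0] * n_particles
        # Move: one random-walk Metropolis step per particle, targeting the
        # posterior given only the data seen so far.
        for i, t in enumerate(theta):
            prop = t + step * rng.gauss(0.0, 1.0)
            log_a = (log_prior(prop) + log_lik(prop, seen)
                     - log_prior(t) - log_lik(t, seen))
            if math.log(1.0 - rng.random()) < log_a:
                theta[i] = prop
    return theta, normalise(log_w)


if __name__ == "__main__":
    gen = random.Random(1)
    data = [gen.gauss(2.0, 1.0) for _ in range(200)]
    particles, weights = data_annealed_smc(data)
    print(sum(w * t for w, t in zip(weights, particles)))  # posterior-mean estimate
```

In this sketch the early iterations evaluate the likelihood on only a fraction of the data, which is where a saving over full-batch evaluation would come from. How the schedule is chosen, and how it interacts with gradient-based proposals for NNs, is the subject of the paper and is not captured here.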
