ML LGSep 6, 2022

Bayesian Neural Network Inference via Implicit Models and the Posterior Predictive Distribution

Joel Janek Dabrowski, Daniel Edward Pagendam

arXiv:2209.02188v12.1h-index: 15

Originality Incremental advance

AI Analysis

This addresses the problem of efficient Bayesian inference for practitioners in fields like surrogate modeling and physics-based simulations, offering a novel method that balances scalability and expressiveness, though it appears incremental relative to existing two-model frameworks.

The paper tackles the challenge of scalable Bayesian inference in complex models like neural networks by proposing a method that uses an implicit model to approximate the posterior distribution, optimizing it via the posterior predictive distribution. The approach is shown to be more scalable than MCMC, more expressive than Variational Inference, and applicable to various likelihood forms, with demonstrations including uncertainty quantification and multi-modality.

We propose a novel approach to perform approximate Bayesian inference in complex models such as Bayesian neural networks. The approach is more scalable to large data than Markov Chain Monte Carlo, it embraces more expressive models than Variational Inference, and it does not rely on adversarial training (or density ratio estimation). We adopt the recent approach of constructing two models: (1) a primary model, tasked with performing regression or classification; and (2) a secondary, expressive (e.g. implicit) model that defines an approximate posterior distribution over the parameters of the primary model. However, we optimise the parameters of the posterior model via gradient descent according to a Monte Carlo estimate of the posterior predictive distribution -- which is our only approximation (other than the posterior model). Only a likelihood needs to be specified, which can take various forms such as loss functions and synthetic likelihoods, thus providing a form of a likelihood-free approach. Furthermore, we formulate the approach such that the posterior samples can either be independent of, or conditionally dependent upon the inputs to the primary model. The latter approach is shown to be capable of increasing the apparent complexity of the primary model. We see this being useful in applications such as surrogate and physics-based models. To promote how the Bayesian paradigm offers more than just uncertainty quantification, we demonstrate: uncertainty quantification, multi-modality, as well as an application with a recent deep forecasting neural network architecture.

View on arXiv PDF

Similar