ML CE LG DATA-AN · Jul 19, 2025

Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators

Ponkrshnan Thiagarajan, Tamer A. Zaki, Michael D. Shields

arXiv:2507.14652v210.34 citations · h-index: 40 · Comput Method Appl Mech Eng

Originality: Incremental advance

AI Analysis

This work targets slow and inaccurate uncertainty estimation in Bayesian neural networks, a concern for researchers and practitioners in machine learning and scientific computing. It offers an incremental improvement over existing methods.

The paper tackles the computational inefficiency of Hamiltonian Monte Carlo (HMC) for Bayesian inference in neural networks by proposing a hybrid method that uses variational inference to identify and reduce the parameter space, enabling faster HMC sampling. It demonstrates the approach on networks with tens to hundreds of thousands of parameters, achieving efficient uncertainty quantification for complex physical systems like hypersonic flow modeling.

Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. However, HMC techniques are computationally demanding for Bayesian neural networks due to the high dimensionality of the network's parameter space and the non-convexity of their posterior distributions. Therefore, various approximation techniques, such as variational inference (VI) or stochastic gradient MCMC, are often employed to infer the posterior distribution of the network parameters. Such approximations introduce inaccuracies in the inferred distributions, resulting in unreliable uncertainty estimates. In this work, we propose a hybrid approach that combines inexpensive VI and accurate HMC methods to efficiently and accurately quantify uncertainties in neural networks and neural operators. The proposed approach leverages an initial VI training on the full network. We examine the influence of individual parameters on the prediction uncertainty, which shows that a large proportion of the parameters do not contribute substantially to uncertainty in the network predictions. This information is then used to significantly reduce the dimension of the parameter space, and HMC is performed only for the subset of network parameters that strongly influence prediction uncertainties. This yields a framework for accelerating full-batch HMC for posterior inference in neural networks. We demonstrate the efficiency and accuracy of the proposed framework on deep neural networks and operator networks, showing that inference can be performed for large networks with tens to hundreds of thousands of parameters. We show that this method can effectively learn surrogates for complex physical systems by modeling the operator that maps from upstream conditions to wall-pressure data on a cone in hypersonic flow.
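
The three stages described in the abstract (VI on all parameters, ranking parameters by their influence on prediction uncertainty, then HMC on the selected subset) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a toy linear-Gaussian model in place of a neural network, uses the closed-form mean-field Gaussian fit that is available for that toy in place of VI training, and ranks parameters by a hypothetical score (each weight's contribution to predictive variance under the mean-field fit). The selection criterion used in the paper may differ. All names (`score`, `active`, `hmc`, the step size, and the subset size `k`) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a network: a linear model with d weights (assumption).
n, d, noise_std, prior_std = 100, 20, 0.1, 1.0
X = rng.normal(size=(n, d)) * np.logspace(-3, 0, d)  # columns with very different scales
w_true = rng.normal(size=d)
y = X @ w_true + noise_std * rng.normal(size=n)

# Stage 1: mean-field Gaussian approximation of the posterior over all weights.
# For a linear-Gaussian model the mean-field optimum is known in closed form:
# mean = exact posterior mean, variance_i = 1 / precision_ii.
precision = X.T @ X / noise_std**2 + np.eye(d) / prior_std**2
vi_mean = np.linalg.solve(precision, X.T @ y / noise_std**2)
vi_std = 1.0 / np.sqrt(np.diag(precision))

# Stage 2: hypothetical influence score, the average contribution of each
# weight to the predictive variance sum_i x_i^2 * std_i^2 under the fit above.
score = (X**2).mean(axis=0) * vi_std**2
k = 5                                # size of the retained subset (assumption)
active = np.argsort(score)[-k:]      # indices of the k most influential weights

# Stage 3: full-batch HMC over the active weights only; the remaining
# weights stay fixed at their mean-field means.
def log_post_and_grad(w_active):
    w = vi_mean.copy()
    w[active] = w_active
    resid = y - X @ w
    logp = -0.5 * resid @ resid / noise_std**2 - 0.5 * w @ w / prior_std**2
    grad = (X.T @ resid / noise_std**2 - w / prior_std**2)[active]
    return logp, grad

def hmc(q0, n_samples=500, step=0.005, n_leapfrog=20):
    q = q0.copy()
    logp, grad = log_post_and_grad(q)
    samples, n_accept = [], 0
    for _ in range(n_samples):
        p = rng.normal(size=q.shape)          # resample momentum
        h_old = -logp + 0.5 * p @ p
        q_new = q.copy()
        p_new = p + 0.5 * step * grad         # initial half step in momentum
        for i in range(n_leapfrog):
            q_new = q_new + step * p_new      # full step in position
            logp_new, grad_new = log_post_and_grad(q_new)
            if i < n_leapfrog - 1:
                p_new = p_new + step * grad_new
        p_new = p_new + 0.5 * step * grad_new  # final half step in momentum
        h_new = -logp_new + 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < h_old - h_new:  # Metropolis correction
            q, logp, grad = q_new, logp_new, grad_new
            n_accept += 1
        samples.append(q.copy())
    return np.array(samples), n_accept / n_samples

samples, accept_rate = hmc(vi_mean[active])
print("active weights:", np.sort(active))
print("acceptance rate:", accept_rate)
print("posterior std of active weights:", samples.std(axis=0))
```

In this sketch the sampler works in 5 dimensions instead of 20, which mirrors, at toy scale, the dimension reduction the abstract describes. For a real network the gradient would come from automatic differentiation rather than the closed-form expression used here.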
