Leveraging Nested MLMC for Sequential Neural Posterior Estimation with Intractable Likelihoods
This work addresses the difficulty of analyzing convergence in sequential neural posterior estimation (SNPE) for simulation-based models, an incremental but important contribution for researchers in Bayesian inference and computational statistics.
The paper tackles the challenge of analyzing convergence in sequential neural posterior estimation (SNPE) for models with intractable likelihoods. It reformulates the automatic posterior transformation (APT) method as a nested estimation problem and constructs multilevel Monte Carlo (MLMC) estimators for the loss function and its gradients: two unbiased estimators, and a biased estimator that trades a small bias for reduced variance and controlled runtime and memory usage. Numerical experiments on complex multimodal posteriors in moderate dimensions examine the effectiveness of the proposed methods.
There is growing interest in sequential neural posterior estimation (SNPE) techniques because of their advantages for simulation-based models with intractable likelihoods. These methods aim to learn the posterior from adaptively proposed simulations using neural network-based conditional density estimators. As an SNPE technique, the automatic posterior transformation (APT) method proposed by Greenberg et al. (2019) performs well and scales to high-dimensional data. However, the APT method requires computing the expectation of the logarithm of an intractable normalizing constant, i.e., a nested expectation. Although atomic proposals were used to render an analytical normalizing constant, it remains challenging to analyze the convergence of learning. In this paper, we reformulate APT as a nested estimation problem. Building on this, we construct several multilevel Monte Carlo (MLMC) estimators for the loss function and its gradients to accommodate different scenarios, including two unbiased estimators and a biased estimator that trades a small bias for reduced variance and controlled runtime and memory usage. We also provide convergence results for stochastic gradient descent that quantify the interaction between the bias and variance of the gradient estimator. Numerical experiments on approximating complex multimodal posteriors in moderate dimensions are provided to examine the effectiveness of the proposed methods.
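To make the idea of a nested expectation concrete, the following is a minimal sketch of a truncated (hence slightly biased) antithetic MLMC estimator on a toy problem. It is not the APT loss or the estimators constructed in the paper. The integrand g(x, y) = exp(-(y - x)^2 / 2), the standard normal sampling distributions, the sample allocation rule, and all function names are assumptions chosen for illustration, so that the target E_X[log E_Y[g(X, Y)]] has the closed form -0.5 log 2 - 1/4.

```python
import numpy as np

def log_g(x, y):
    # Toy integrand (an assumption): g(x, y) = exp(-(y - x)^2 / 2), on the log scale.
    return -0.5 * (y - x) ** 2

def log_mean_exp(a, axis):
    # Numerically stable log of the average of exp(a) along `axis`.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(a - m), axis=axis))

def level_difference(level, n_outer, rng):
    # One batch of antithetic MLMC corrections at the given level.
    # Level l uses 2**l inner samples per outer sample.
    x = rng.standard_normal((n_outer, 1))            # outer samples X ~ N(0, 1)
    y = rng.standard_normal((n_outer, 2 ** level))   # inner samples Y ~ N(0, 1)
    lg = log_g(x, y)
    fine = log_mean_exp(lg, axis=1)
    if level == 0:
        return fine
    half = 2 ** (level - 1)
    # Antithetic coarse term: average the log-estimates built from each half
    # of the same inner samples used by the fine term.
    coarse = 0.5 * (log_mean_exp(lg[:, :half], axis=1)
                    + log_mean_exp(lg[:, half:], axis=1))
    return fine - coarse

def mlmc_estimate(max_level, n0, rng):
    # Telescoping sum over levels 0..max_level. Truncating at max_level
    # leaves a small bias in exchange for bounded cost and memory.
    total = 0.0
    for level in range(max_level + 1):
        # Illustrative allocation: fewer outer samples at costlier levels.
        n_outer = max(2, int(np.ceil(n0 * 2.0 ** (-1.5 * level))))
        total += level_difference(level, n_outer, rng).mean()
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    estimate = mlmc_estimate(max_level=6, n0=20000, rng=rng)
    exact = -0.5 * np.log(2.0) - 0.25
    print(f"MLMC estimate: {estimate:.4f}, closed form: {exact:.4f}")
```

The antithetic construction reuses the fine-level inner samples for the two coarse halves, so the level corrections shrink quickly as the level grows and most outer samples can be spent on the cheap levels. An unbiased variant would replace the fixed truncation with a randomly drawn level and reweight the correction by the inverse of that level's probability; that variant is not shown here.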