Neural Variational Gradient Descent
The work addresses kernel selection, a key bottleneck in particle-based Bayesian inference, and offers researchers and practitioners a more automated approach, though it appears incremental because it builds directly on SVGD.
The paper tackles the challenge of kernel selection in Stein Variational Gradient Descent (SVGD) by proposing Neural Variational Gradient Descent (NVGD), which parameterizes the witness function of the Stein discrepancy with a deep neural network and so removes the need to choose a kernel. The method is evaluated empirically on synthetic and real-world Bayesian inference tasks.
Particle-based approximate Bayesian inference approaches such as Stein Variational Gradient Descent (SVGD) combine the flexibility and convergence guarantees of sampling methods with the computational benefits of variational inference. In practice, SVGD relies on the choice of an appropriate kernel function, which impacts its ability to model the target distribution -- a challenging problem with only heuristic solutions. We propose Neural Variational Gradient Descent (NVGD), which is based on parameterizing the witness function of the Stein discrepancy by a deep neural network whose parameters are learned in parallel to the inference, mitigating the necessity to make any kernel choices whatsoever. We empirically evaluate our method on popular synthetic inference problems, real-world Bayesian linear regression, and Bayesian neural network inference.
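To make the idea of learning a witness function alongside the particles concrete, here is a minimal one-dimensional sketch. It is an illustration under stated assumptions, not the authors' implementation: the deep network is replaced by a linear witness f(x) = a + b*x, the target is a toy Gaussian N(2, 1), and the training objective is assumed to be an L2-regularized Stein objective, mean[f(x) * score(x) + f'(x)] - 0.5 * mean[f(x)^2]. All function names and hyperparameters (`fit_witness`, `run`, `step_size`, and so on) are hypothetical.

```python
import random


def target_score(x, mu=2.0):
    # Score (gradient of the log density) of the toy target N(mu, 1).
    return -(x - mu)


def fit_witness(xs, a, b, lr=0.1, steps=20):
    # Assumed objective (an L2-regularized Stein objective):
    #   J(a, b) = mean[f(x) * score(x) + f'(x)] - 0.5 * mean[f(x)^2]
    # with the linear stand-in witness f(x) = a + b * x, so f'(x) = b.
    # A few gradient-ascent steps are taken on J for the current particles.
    n = len(xs)
    for _ in range(steps):
        grad_a = sum(target_score(x) - (a + b * x) for x in xs) / n
        grad_b = sum(x * target_score(x) + 1.0 - (a + b * x) * x for x in xs) / n
        a += lr * grad_a
        b += lr * grad_b
    return a, b


def run(n_particles=100, n_iters=500, step_size=0.05, seed=0):
    rng = random.Random(seed)
    # Particles start far from the target: centred at 0 and too narrow.
    xs = [rng.gauss(0.0, 0.5) for _ in range(n_particles)]
    a, b = 0.0, 0.0  # witness parameters, warm-started across iterations
    for _ in range(n_iters):
        # Learn the witness in parallel with inference...
        a, b = fit_witness(xs, a, b)
        # ...then move every particle along the learned witness.
        xs = [x + step_size * (a + b * x) for x in xs]
    return xs


def mean_and_var(xs):
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v


if __name__ == "__main__":
    m, v = mean_and_var(run())
    print(f"particle mean {m:.3f}, variance {v:.3f}")
```

No kernel or bandwidth appears anywhere in the loop; the only modelling choice is the function class of the witness. A linear witness is enough here only because both the particles and the target are roughly Gaussian, and a richer target would presumably need a more expressive model, which is where a neural network would come in.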