LG Jan 15, 2021

Efficient Semi-Implicit Variational Inference

Vincent Moens, Hang Ren, Alexandre Maraval, Rasul Tutunov, Jun Wang, Haitham Ammar

arXiv:2101.06070v19.210 citations, h-index: 47

Originality: Incremental advance

AI Analysis

This work addresses scalability issues in Bayesian inference for practitioners using deep network models, though it appears incremental: it is an optimization improvement to the existing SIVI framework.

The paper tackles the computational challenge of semi-implicit variational inference (SIVI) by proposing CI-VI, an efficient solver that achieves an O(t^{-4/5}) gradient-bias-vanishing rate, and it demonstrates the solver's effectiveness in approximating complex posteriors on various datasets, including ones from natural language processing.

In this paper, we propose CI-VI, an efficient and scalable solver for semi-implicit variational inference (SIVI). Our method first maps SIVI's evidence lower bound (ELBO) to a form involving a nonlinear functional nesting of expected values, and then develops a rigorous optimiser capable of correctly handling the bias inherent to nonlinear nested expectations, using an extrapolation-smoothing mechanism coupled with gradient sketching. Our theoretical results demonstrate convergence to a stationary point of the ELBO in the general non-convex settings that typically arise when using deep network models, with a gradient-bias-vanishing rate of order $O(t^{-\frac{4}{5}})$. We believe these results generalise beyond the specific nesting arising from SIVI to other forms. Finally, in a set of experiments, we demonstrate the effectiveness of our algorithm in approximating complex posteriors on various datasets, including those from natural language processing.
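The bias that the abstract attributes to nonlinear nested expectations can be seen in a toy setting. The sketch below is not CI-VI and does not reproduce anything from the paper. It only estimates log E[w] by plugging a finite inner Monte Carlo average into the logarithm, which by Jensen's inequality underestimates the true value on average. The Exponential(1) inner distribution, the sample sizes, and the function names `plug_in_log_mean` and `average_bias` are assumptions chosen for illustration.

```python
import math
import random


def plug_in_log_mean(sample_inner, k, rng):
    """Plug-in estimate of log E[w]: the log of an average of k inner samples."""
    inner_mean = sum(sample_inner(rng) for _ in range(k)) / k
    return math.log(inner_mean)


def average_bias(k, n_outer=20000, seed=0):
    """Average the plug-in estimate over many repetitions and subtract the truth.

    Assumed toy model: w ~ Exponential(1), so E[w] = 1 and log E[w] = 0.
    """
    rng = random.Random(seed)

    def sample_inner(r):
        return r.expovariate(1.0)

    total = sum(plug_in_log_mean(sample_inner, k, rng) for _ in range(n_outer))
    true_value = 0.0  # log E[w] for Exponential(1)
    return total / n_outer - true_value


# Bias of the plug-in estimator for increasing inner sample sizes.
biases = {k: average_bias(k) for k in (1, 4, 16, 64)}

for k, b in biases.items():
    print(f"inner samples k={k:3d}  estimated bias={b:+.4f}")
```

In this toy case the bias is negative and shrinks as the inner sample size grows, but it never vanishes for a fixed finite k. A stochastic optimiser that differentiates such a plug-in estimate inherits a biased gradient, which is the kind of issue the abstract says the extrapolation-smoothing mechanism is designed to handle.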
