LG, ML · Mar 3, 2020

Automatic Differentiation Variational Inference with Mixtures

arXiv:2003.01687v4 · 31 citations
AI Analysis

This paper addresses the problem of capturing multimodal structure in latent spaces for probabilistic modeling. The contribution is incremental, since it builds on the existing ADVI and IWAE methods.

The paper tackles the limitation of unimodal approximate posteriors in Automatic Differentiation Variational Inference (ADVI) by using stratified sampling to enable mixture distributions as the approximate posterior. The authors report higher accuracy and better calibration on incomplete, limited, or corrupted data.

Automatic Differentiation Variational Inference (ADVI) is a useful tool for efficiently learning probabilistic models in machine learning. Generally approximate posteriors learned by ADVI are forced to be unimodal in order to facilitate use of the reparameterization trick. In this paper, we show how stratified sampling may be used to enable mixture distributions as the approximate posterior, and derive a new lower bound on the evidence analogous to the importance weighted autoencoder (IWAE). We show that this "SIWAE" is a tighter bound than both IWAE and the traditional ELBO, both of which are special instances of this bound. We verify empirically that the traditional ELBO objective disfavors the presence of multimodal posterior distributions and may therefore not be able to fully capture structure in the latent space. Our experiments show that using the SIWAE objective allows the encoder to learn more complex distributions which regularly contain multimodality, resulting in higher accuracy and better calibration in the presence of incomplete, limited, or corrupted data.
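The stratified-sampling idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a one-dimensional latent variable, Gaussian mixture components, and an importance-weighted bound in which each of the K components contributes T reparameterized samples weighted by its mixture weight. The names `siwae_estimate`, `log_joint`, `alphas`, `mus`, and `sigmas` are hypothetical, and the exact form of the bound should be checked against the paper.

```python
import math
import random


def log_normal_pdf(z, mu, sigma):
    """Log-density of a 1-D Gaussian N(mu, sigma^2) at z."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((z - mu) / sigma) ** 2


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_mixture_pdf(z, alphas, mus, sigmas):
    """Log-density of the mixture q(z) = sum_k alpha_k N(z; mu_k, sigma_k^2)."""
    return logsumexp(
        [math.log(a) + log_normal_pdf(z, m, s) for a, m, s in zip(alphas, mus, sigmas)]
    )


def siwae_estimate(log_joint, alphas, mus, sigmas, T, rng):
    """One Monte Carlo estimate of a stratified importance-weighted bound.

    Assumed form (a sketch, to be checked against the paper):
        log (1/T) sum_t sum_k alpha_k * p(x, z_kt) / q(z_kt),  z_kt ~ q_k.
    log_joint(z) is a hypothetical callable returning log p(x, z).
    All mixture weights are assumed strictly positive.
    """
    log_terms = []
    for a, m, s in zip(alphas, mus, sigmas):
        for _ in range(T):
            # Reparameterized draw from component k: no discrete sampling
            # of the component index is needed, since every k is visited.
            z = m + s * rng.gauss(0.0, 1.0)
            log_q = log_mixture_pdf(z, alphas, mus, sigmas)
            log_terms.append(math.log(a) + log_joint(z) - log_q)
    return logsumexp(log_terms) - math.log(T)
```

Under these assumptions, setting K = 1 gives an IWAE-style estimate with T samples, and K = 1 with T = 1 gives a single-sample ELBO estimate, which matches the abstract's remark that both are special instances. If `log_joint` equals the mixture log-density plus a constant c, every importance ratio is exactly exp(c), so the estimate returns c for any T.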
