ST LG ML · Oct 23, 2020

Statistical Guarantees for Transformation Based Models with Applications to Implicit Variational Inference

Sean Plummer, Shuang Zhou, Anirban Bhattacharya, David Dunson, Debdeep Pati

arXiv:2010.14056v22.31 citations

Originality Highly original

AI Analysis

This work provides foundational theoretical guarantees for implicit variational inference, addressing a key gap in machine learning methodology.

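The authors address the lack of theoretical justification for transformation-based models in non-parametric inference and variational inference. They prove that the prior induced by non-linear latent variable models (NL-LVMs) has large support in the $L_1$ sense and that, under a Gaussian process prior on the transformation function, the posterior concentrates at the optimal rate up to a logarithmic factor. They also introduce GP-IVI, an implicit variational family built on the NL-LVM, and give sufficient conditions under which it achieves optimal risk bounds and approximates the true posterior in Kullback-Leibler divergence.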

Transformation-based methods have been an attractive approach in non-parametric inference for problems such as unconditional and conditional density estimation, owing to their hierarchical structure, which models the data as a flexible transformation of a set of common latent variables. More recently, transformation-based models have been used in variational inference (VI) to construct flexible implicit families of variational distributions. However, their use in both non-parametric inference and variational inference lacks theoretical justification. We provide theoretical justification for the use of non-linear latent variable models (NL-LVMs) in non-parametric inference by showing that the support of the transformation-induced prior in the space of densities is sufficiently large in the $L_1$ sense. We also show that, when a Gaussian process (GP) prior is placed on the transformation function, the posterior concentrates at the optimal rate up to a logarithmic factor. Adopting the flexibility demonstrated in the non-parametric setting, we use the NL-LVM to construct an implicit family of variational distributions, deemed GP-IVI. We delineate sufficient conditions under which GP-IVI achieves optimal risk bounds and approximates the true posterior in the sense of the Kullback-Leibler divergence. To the best of our knowledge, this is the first work to provide theoretical guarantees for implicit variational inference.
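To make the idea of "data as a flexible transformation of common latent variables" concrete, here is a minimal sketch of drawing samples from one possible NL-LVM. The abstract does not spell out the model, so the following are assumptions made only for illustration: a uniform latent variable on [0, 1], additive Gaussian noise (y = mu(eta) + eps), a squared-exponential GP kernel, and a grid discretization of the GP path with linear interpolation. All function names are hypothetical.

```python
import math
import random


def sq_exp_kernel(s, t, length_scale=0.2):
    """Squared-exponential covariance (an assumed kernel choice)."""
    return math.exp(-((s - t) ** 2) / (2.0 * length_scale ** 2))


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def sample_gp_path(grid, rng, jitter=1e-6):
    """Draw one zero-mean GP path evaluated on the grid points."""
    n = len(grid)
    # Jitter on the diagonal keeps the covariance numerically positive definite.
    cov = [[sq_exp_kernel(grid[i], grid[j]) + (jitter if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


def interpolate(grid, values, x):
    """Linear interpolation of grid values at x (grid must be equally spaced)."""
    step = grid[1] - grid[0]
    idx = min(int((x - grid[0]) / step), len(grid) - 2)
    w = (x - grid[idx]) / step
    return (1.0 - w) * values[idx] + w * values[idx + 1]


def sample_nl_lvm(n, sigma=0.1, grid_size=25, seed=0):
    """Sample y_i = mu(eta_i) + eps_i with mu ~ GP, eta_i ~ U(0, 1), eps_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    mu = sample_gp_path(grid, rng)           # one draw of the transformation
    ys = []
    for _ in range(n):
        eta = rng.random()                    # latent variable shared in distribution
        ys.append(interpolate(grid, mu, eta) + rng.gauss(0.0, sigma))
    return grid, mu, ys


if __name__ == "__main__":
    _, _, ys = sample_nl_lvm(1000)
    print("sample mean:", sum(ys) / len(ys))
```

Under these assumptions, each draw of `mu` induces a different density for `y`, which is the sense in which a prior on the transformation becomes a prior on the space of densities. In the variational setting the same construction would serve as a sampler for the variational family, whose density is not available in closed form.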
