Amortized Variational Inference for Simple Hierarchical Models
This paper addresses inference in large-scale hierarchical models, which is slow and often infeasible. It is aimed at researchers and practitioners in machine learning and represents an incremental improvement over existing methods.
The paper tackles the challenge of scaling variational inference in hierarchical models by proposing an amortized approach in which shared parameters represent all local distributions. The approach achieves accuracy similar to that of a given joint distribution, enables inference on datasets several orders of magnitude larger, and is dramatically faster than structured variational distributions.
It is difficult to use subsampling with variational inference in hierarchical models since the number of local latent variables scales with the dataset. Thus, inference in hierarchical models remains a challenge at large scale. It is helpful to use a variational family with structure matching the posterior, but optimization is still slow due to the huge number of local distributions. Instead, this paper suggests an amortized approach where shared parameters simultaneously represent all local distributions. This approach is about as accurate as using a given joint distribution (e.g., a full-rank Gaussian) but is feasible on datasets that are several orders of magnitude larger. It is also dramatically faster than using a structured variational distribution.
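The idea of shared parameters representing all local distributions can be illustrated with a minimal sketch. It is not the paper's implementation: the tiny one-hidden-layer network, the scalar per-datapoint summary `x_i`, the Gaussian form of each local distribution, and every function name below are assumptions made for illustration. The sketch shows only that the number of variational parameters stays fixed as the dataset grows, and that local latent variables can be sampled for a minibatch without storing anything per datapoint.

```python
import math
import random


def init_shared_params(hidden, rng):
    """Shared weights of a tiny one-hidden-layer network (hypothetical architecture)."""
    return {
        "w1": [rng.gauss(0.0, 0.1) for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w_mu": [rng.gauss(0.0, 0.1) for _ in range(hidden)],
        "b_mu": 0.0,
        "w_ls": [rng.gauss(0.0, 0.1) for _ in range(hidden)],
        "b_ls": 0.0,
    }


def local_variational_params(params, x_i):
    """Map one datapoint's summary x_i to (mean, log_std) of an assumed Gaussian q(z_i)."""
    h = [math.tanh(w * x_i + b) for w, b in zip(params["w1"], params["b1"])]
    mu = sum(w * hj for w, hj in zip(params["w_mu"], h)) + params["b_mu"]
    log_std = sum(w * hj for w, hj in zip(params["w_ls"], h)) + params["b_ls"]
    return mu, log_std


def sample_local(params, x_i, rng):
    """Reparameterized draw z_i = mu + std * eps, with eps ~ N(0, 1)."""
    mu, log_std = local_variational_params(params, x_i)
    return mu + math.exp(log_std) * rng.gauss(0.0, 1.0)


def count_params(params):
    """Count the scalar entries in the shared parameter dictionary."""
    return sum(len(v) if isinstance(v, list) else 1 for v in params.values())


rng = random.Random(0)
params = init_shared_params(hidden=8, rng=rng)

# Synthetic stand-in for a large dataset: one scalar summary per datapoint.
data = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Subsampling: only the minibatch's local distributions are ever instantiated.
minibatch = rng.sample(data, 32)
z_batch = [sample_local(params, x, rng) for x in minibatch]

n_amortized = count_params(params)   # independent of len(data)
n_per_datapoint = 2 * len(data)      # a separate (mean, log_std) pair for each z_i
print(n_amortized, n_per_datapoint)  # prints "34 200000"
```

In this sketch a gradient step on a minibatch would update all 34 shared parameters, whereas a per-datapoint parameterization would update only the 64 parameters belonging to the sampled datapoints out of 200,000. That contrast is one way to read the abstract's point about optimization being slow with a huge number of local distributions.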