Pseudo-Encoded Stochastic Variational Inference
This work addresses the test-time inference cost faced by users of variational inference in directed graphical models. It offers a significant speed-up while maintaining performance, though the contribution is incremental because it builds on SVI.
The paper tackles the computational cost of Stochastic Variational Inference (SVI) at test time by introducing Pseudo-Encoded SVI (PE-SVI). PE-SVI reduces the number of gradient steps required by finding a better initialization point, which in turn allows larger step sizes. It reaches the same ELBo objective as SVI with fewer than 1% of the steps, on average.
Posterior inference in directed graphical models is commonly done using a probabilistic encoder (a.k.a. inference model) conditioned on the input. Often this inference model is trained jointly with the probabilistic decoder (a.k.a. generator model). If the probabilistic encoder encounters complexities during training (e.g. suboptimal complexity or parameterization), then learning reaches a suboptimal objective, a phenomenon commonly called inference suboptimality. In Variational Inference (VI), optimizing the ELBo using Stochastic Variational Inference (SVI) can eliminate the inference suboptimality (as demonstrated in this paper). However, this solution comes at a substantial computational cost when inference needs to be done on new data points: a long sequential chain of gradient updates is required to fully optimize the approximate posteriors. In this paper, we present an approach called Pseudo-Encoded Stochastic Variational Inference (PE-SVI) to reduce the inference complexity of SVI during test time. Our approach relies on finding a suitable starting point for the gradient updates, which naturally reduces the number of gradient steps required. Furthermore, this initialization allows for adopting larger step sizes (compared to the random initialization used in SVI), which further reduces the inference time complexity. PE-SVI reaches the same ELBo objective as SVI using less than one percent of the required steps, on average.
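
The effect of the starting point on the number of gradient steps can be illustrated with a toy sketch. This is not the paper's method or experimental setup: it assumes a one-dimensional linear-Gaussian model, p(z) = N(0, 1) and p(x|z) = N(w z, sigma^2), with a Gaussian approximate posterior q(z) = N(mu, s^2), for which the ELBo gradient is available in closed form. The names elbo_grad and steps_to_converge, the values of w, sigma, and x, and the "warm" starting point (the exact optimum plus a small offset, standing in for whatever an initializer might produce) are all assumptions made for illustration.

```python
import math

W, SIGMA = 2.0, 0.5                      # assumed toy decoder parameters
PREC = 1.0 + W * W / (SIGMA * SIGMA)     # exact posterior precision


def elbo_grad(mu, rho, x):
    """Closed-form ELBo gradient w.r.t. mu and rho = log s for the toy model."""
    s2 = math.exp(2.0 * rho)
    d_mu = W * x / SIGMA**2 - PREC * mu
    d_rho = 1.0 - PREC * s2
    return d_mu, d_rho


def steps_to_converge(mu, rho, x, lr=0.01, tol=1e-4, max_steps=100_000):
    """Run plain gradient ascent and count steps until the gradient norm < tol."""
    for step in range(max_steps):
        d_mu, d_rho = elbo_grad(mu, rho, x)
        if math.hypot(d_mu, d_rho) < tol:
            return step
        mu += lr * d_mu
        rho += lr * d_rho
    return max_steps


x = 1.0
mu_star = W * x / (W * W + SIGMA * SIGMA)   # exact posterior mean
rho_star = -0.5 * math.log(PREC)            # log of exact posterior std

# Cold start: an arbitrary initialization far from the optimum.
cold = steps_to_converge(0.0, 0.0, x)
# Warm start: hypothetical initializer output, close to the optimum.
warm = steps_to_converge(mu_star + 0.01, rho_star + 0.01, x)
# Warm start again, with a ten times larger step size.
warm_big = steps_to_converge(mu_star + 0.01, rho_star + 0.01, x, lr=0.1)

print(cold, warm, warm_big)
```

In this toy problem the warm start converges in fewer steps than the cold start, and fewer still with the larger step size. The sketch only shows the direction of the effect under these assumptions and does not reproduce the step counts reported in the paper.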