Hierarchical VAE with a Diffusion-based VampPrior
This work addresses a scaling bottleneck in deep generative modeling: applying the VampPrior to hierarchical VAEs with many stochastic layers. It offers an incremental improvement over existing hierarchical VAE methods.
The paper scales the VampPrior to hierarchical variational autoencoders with many stochastic layers by combining a diffusion-based VampPrior with amortization. It reports better performance than the original VampPrior and other deep hierarchical VAEs on MNIST, OMNIGLOT, and CIFAR10, with fewer parameters and improved training stability.
Deep hierarchical variational autoencoders (VAEs) are powerful latent variable generative models. In this paper, we introduce Hierarchical VAE with Diffusion-based Variational Mixture of the Posterior Prior (VampPrior). We apply amortization to scale the VampPrior to models with many stochastic layers. The proposed approach allows us to achieve better performance compared to the original VampPrior work and other deep hierarchical VAEs, while using fewer parameters. We empirically validate our method on standard benchmark datasets (MNIST, OMNIGLOT, CIFAR10) and demonstrate improved training stability and latent space utilization.
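For readers unfamiliar with the term, the sketch below illustrates the mixture that gives the VampPrior its name, as it is commonly defined: the prior is the average of the encoder's posteriors evaluated at K pseudo-inputs u_k, that is, p(z) = (1/K) Σ_k q(z | u_k). It is a minimal NumPy illustration under stated assumptions and not the method of this paper: the linear `toy_encoder`, the weight matrices, the dimensions, and the randomly drawn pseudo-inputs are all hypothetical, and the diffusion-based and amortized components are not shown because the abstract does not specify them.

```python
import numpy as np


def gaussian_log_prob(z, mean, log_var):
    """Log density of a diagonal Gaussian, summed over the last axis."""
    return -0.5 * np.sum(
        np.log(2.0 * np.pi) + log_var + (z - mean) ** 2 / np.exp(log_var),
        axis=-1,
    )


def toy_encoder(x, W_mean, W_log_var):
    """Hypothetical linear encoder: mean and log-variance of q(z | x)."""
    return x @ W_mean, x @ W_log_var


def vamp_prior_log_prob(z, pseudo_inputs, W_mean, W_log_var):
    """log p(z) = log (1/K) sum_k q(z | u_k) for a batch z of shape (B, D)."""
    mean, log_var = toy_encoder(pseudo_inputs, W_mean, W_log_var)  # (K, D) each
    # Broadcast z as (B, 1, D) against the K components (1, K, D) -> (B, K).
    comp = gaussian_log_prob(z[:, None, :], mean[None], log_var[None])
    num_components = pseudo_inputs.shape[0]
    # Log-sum-exp over components, shifted by the max for numerical stability.
    shift = comp.max(axis=1, keepdims=True)
    log_sum = shift + np.log(np.exp(comp - shift).sum(axis=1, keepdims=True))
    return log_sum.squeeze(1) - np.log(num_components)


# Illustrative sizes only (assumptions, not values from the paper).
rng = np.random.default_rng(0)
X_DIM, Z_DIM, K = 8, 2, 5
W_mean = rng.normal(size=(X_DIM, Z_DIM))
W_log_var = 0.1 * rng.normal(size=(X_DIM, Z_DIM))
pseudo_inputs = rng.normal(size=(K, X_DIM))  # learnable in a real model

z = rng.normal(size=(4, Z_DIM))
log_p = vamp_prior_log_prob(z, pseudo_inputs, W_mean, W_log_var)
```

In a trained model the pseudo-inputs would be optimized jointly with the encoder rather than sampled at random, and in a hierarchical VAE a prior of this kind would have to be defined for the stochastic layers, which is the scaling problem the abstract says amortization is meant to address.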