"Best-of-Many-Samples" Distribution Matching
This work addresses a key challenge in generative modeling: it offers a more stable and effective hybrid approach, though it is incremental relative to prior VAE-GAN work.
The paper tackles mode collapse in GANs and poor sample quality in VAEs by proposing a hybrid VAE-GAN framework with a "Best-of-Many-Samples" reconstruction cost and a stable direct estimate of the synthetic likelihood, reporting significant improvements in mode coverage and sample quality over both hybrid VAE-GANs and plain GANs.
Generative Adversarial Networks (GANs) can achieve state-of-the-art sample quality in generative modelling tasks but suffer from the mode collapse problem. Variational Autoencoders (VAEs), on the other hand, explicitly maximize a reconstruction-based data log-likelihood, which forces them to cover all modes, but they suffer from poorer sample quality. Recent works have proposed hybrid VAE-GAN frameworks that integrate a GAN-based synthetic likelihood into the VAE objective to address both the mode collapse and sample quality issues, with limited success. This is because the VAE objective forces a trade-off between the data log-likelihood and the divergence to the latent prior. The synthetic likelihood ratio term also shows instability during training. We propose a novel objective with a "Best-of-Many-Samples" reconstruction cost and a stable direct estimate of the synthetic likelihood. This enables our hybrid VAE-GAN framework to achieve high data log-likelihood and low divergence to the latent prior at the same time, and it yields significant improvement over both hybrid VAE-GANs and plain GANs in mode coverage and quality.
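
As a rough illustration of the "Best-of-Many-Samples" idea, the sketch below draws several latent samples per data point and keeps only the smallest reconstruction error. The abstract does not give the exact form of the cost, so everything here is an assumption made for illustration: the squared-error loss, the Gaussian reparameterized sampling, and the hypothetical names `best_of_many_reconstruction`, `decode`, and `num_samples`. It is not the paper's implementation.

```python
import numpy as np

def best_of_many_reconstruction(x, mu, log_var, decode, num_samples=10, rng=None):
    """Per-example best (minimum) and mean squared reconstruction error.

    x:        (batch, dim) data points.
    mu:       (batch, latent_dim) mean of the approximate posterior (assumed Gaussian).
    log_var:  (batch, latent_dim) log-variance of the approximate posterior.
    decode:   hypothetical callable mapping (batch, latent_dim) -> (batch, dim).
    """
    rng = np.random.default_rng() if rng is None else rng
    std = np.exp(0.5 * log_var)

    errors = []
    for _ in range(num_samples):
        # Reparameterized draw from the approximate posterior.
        z = mu + std * rng.standard_normal(mu.shape)
        x_hat = decode(z)
        # Squared error summed over data dimensions (an assumed choice of loss).
        errors.append(np.sum((x - x_hat) ** 2, axis=1))

    errors = np.stack(errors, axis=0)  # shape: (num_samples, batch)
    # "Best of many": only the best sample is penalized for each example.
    # The mean is returned as well, for comparison with the usual average cost.
    return errors.min(axis=0), errors.mean(axis=0)
```

Under these assumptions, the best-of-many cost is never larger than the average cost over the same samples. This gives one intuition for why such a cost could relax the trade-off between reconstruction and divergence to the prior: only one of the sampled latents needs to explain each data point.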