LG · CV · Oct 18, 2022

Optimizing Hierarchical Image VAEs for Sample Quality

arXiv:2210.10205v1 · 5 citations · h-index: 5 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a specific issue in generative modeling for images, offering incremental improvements to existing hierarchical VAE methods.

The paper tackles poor sample quality in hierarchical variational autoencoders (VAEs) for image modeling. It introduces KL-reweighting and a Gaussian output layer to improve the quality of prior samples, and adds classifier-free guidance to trade image diversity for fidelity.

While hierarchical variational autoencoders (VAEs) have achieved great density estimation on image modeling tasks, samples from their prior tend to look less convincing than those from models with similar log-likelihood. We attribute this to learned representations that over-emphasize compressing imperceptible details of the image. To address this, we introduce a KL-reweighting strategy to control the amount of information in each latent group, and employ a Gaussian output layer to reduce sharpness in the learning objective. To trade off image diversity for fidelity, we additionally introduce a classifier-free guidance strategy for hierarchical VAEs. We demonstrate the effectiveness of these techniques in our experiments. Code is available at https://github.com/tcl9876/visual-vae.
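
To make the first two ideas in the abstract concrete, here is a minimal sketch of a training loss that combines a Gaussian output likelihood with a separately weighted KL term for each latent group. It is not taken from the paper or its repository. The function names (`gaussian_kl`, `gaussian_nll`, `reweighted_loss`), the diagonal-Gaussian parameterization, and the fixed weights are all assumptions made for illustration; the paper's actual reweighting rule may differ.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions.

    Hypothetical helper: each argument is a list of floats, with
    log-variances used for numerical convenience.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def gaussian_nll(x, mean, logvar):
    """Negative log-likelihood of x under a diagonal Gaussian output layer."""
    total = 0.0
    for xi, m, lv in zip(x, mean, logvar):
        total += 0.5 * (math.log(2.0 * math.pi) + lv + (xi - m) ** 2 / math.exp(lv))
    return total


def reweighted_loss(nll, group_kls, kl_weights):
    """Reconstruction term plus one weighted KL term per latent group.

    Assumption: the weights are given constants. With every weight equal
    to 1.0 this reduces to the usual negative ELBO.
    """
    if len(group_kls) != len(kl_weights):
        raise ValueError("need exactly one weight per latent group")
    return nll + sum(w * kl for w, kl in zip(kl_weights, group_kls))


if __name__ == "__main__":
    # Two toy latent groups; the second is penalized more heavily.
    kls = [
        gaussian_kl([1.0], [0.0], [0.0], [0.0]),
        gaussian_kl([0.0, 2.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]),
    ]
    nll = gaussian_nll([0.2, -0.1], [0.0, 0.0], [0.0, 0.0])
    print(reweighted_loss(nll, kls, kl_weights=[1.0, 2.0]))
```

Raising the weight of a group penalizes information stored in that group more heavily, which is one simple way to control how much each level of the hierarchy encodes.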
