LG, CV | Mar 25, 2022

Efficient-VDVAE: Less is more

arXiv:2203.13751v2 | 32 citations | h-index: 2 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficiency and stability issues in hierarchical VAEs for image modeling, enabling more practical use in downstream tasks, though it is incremental in nature.

The authors tackled the instability and high computational demands of hierarchical VAEs by introducing simple modifications to the Very Deep VAE, resulting in up to 2.6x faster convergence, 20x memory savings, and comparable or better negative log-likelihood performance on 7 image datasets. They also found that only about 3% of the latent dimensions are sufficient to encode most image information without performance loss.

Hierarchical VAEs have emerged in recent years as a reliable option for maximum likelihood estimation. However, instability issues and demanding computational requirements have hindered research progress in the area. We present simple modifications to the Very Deep VAE to make it converge up to $2.6\times$ faster, save up to $20\times$ in memory load and improve stability during training. Despite these changes, our models achieve comparable or better negative log-likelihood performance than current state-of-the-art models on all $7$ commonly used image datasets we evaluated on. We also make an argument against using 5-bit benchmarks as a way to measure hierarchical VAEs' performance due to undesirable biases caused by the 5-bit quantization. Additionally, we empirically demonstrate that roughly $3\%$ of the hierarchical VAE's latent space dimensions are sufficient to encode most of the image information, without loss of performance, opening the door to efficiently leveraging the hierarchical VAEs' latent space in downstream tasks. We release our source code and models at https://github.com/Rayhane-mamah/Efficient-VDVAE.
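
For readers unfamiliar with the term, the following is a minimal sketch of what 5-bit quantization does to 8-bit pixel data. It assumes the common convention of dropping the three low-order bits of each value. The function names are hypothetical and are not taken from the Efficient-VDVAE repository, and the sketch does not attempt to reproduce the paper's argument about the resulting biases.

```python
def quantize_to_bits(pixels, n_bits=5):
    """Reduce 8-bit pixel values (0..255) to n_bits by dropping low-order bits.

    Assumption: quantization is done by an integer right shift, which is
    equivalent to floor(p / 2 ** (8 - n_bits)).
    """
    if not 1 <= n_bits <= 8:
        raise ValueError("n_bits must be between 1 and 8")
    shift = 8 - n_bits
    return [p >> shift for p in pixels]


def dequantize_to_8bit(levels, n_bits=5):
    """Map n_bits levels back onto the 8-bit range (the dropped bits stay lost)."""
    shift = 8 - n_bits
    return [q << shift for q in levels]


row = [0, 7, 8, 100, 255]              # hypothetical 8-bit pixel values
levels = quantize_to_bits(row)         # values now lie in 0..31
restored = dequantize_to_8bit(levels)  # coarser than the original row
```

With 5 bits, each group of 8 adjacent 8-bit intensities collapses to a single level, so 256 possible values become 32 and the fine detail within each group cannot be recovered.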

Code Implementations: 1 repo
