ML, CV, LG | Apr 3, 2018

Training VAEs Under Structured Residuals

arXiv:1804.01050v3 | 13 citations
Originality: Incremental advance
AI Analysis

This paper addresses a limitation of VAE modeling for image generation by improving how reconstruction residuals are handled, though the advance appears incremental in scope.

The paper tackles the problem of VAEs assuming independent pixel uncertainties by proposing a novel scheme to incorporate structured Gaussian likelihood prediction, allowing residual correlations to be modeled with minimal complexity increase.

Variational auto-encoders (VAEs) are a popular and powerful class of deep generative models. Previous works on VAEs have assumed a factorized likelihood model, whereby the output uncertainty of each pixel is assumed to be independent. This approximation is clearly limited, as demonstrated by observing a residual image from a VAE reconstruction, which often possesses a high level of structure. This paper demonstrates a novel scheme to incorporate a structured Gaussian likelihood prediction network within the VAE that allows the residual correlations to be modeled. Our novel architecture, with minimal increase in complexity, incorporates the covariance matrix prediction within the VAE. We also propose a new mechanism for allowing structured uncertainty on color images. Furthermore, we provide a scheme for effectively training this model, and include some suggestions for improving performance in terms of efficiency or modeling longer range correlations.
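
To make the contrast between a factorized and a structured Gaussian likelihood concrete, here is a minimal dependency-free sketch of the two negative log-likelihoods. It assumes the structured model is parameterized by a lower-triangular Cholesky factor `L` of the precision matrix (so the precision is `L @ L.T`); the abstract does not state the parameterization, so this choice, along with the function names `factorized_nll` and `structured_nll`, is illustrative only and not the authors' implementation.

```python
import math


def factorized_nll(x, mu, sigma):
    """Negative log-likelihood under independent per-pixel Gaussians.

    x, mu, sigma are equal-length lists; sigma holds per-pixel standard
    deviations (all > 0). This is the diagonal-covariance baseline.
    """
    total = 0.0
    for xi, mi, si in zip(x, mu, sigma):
        total += 0.5 * math.log(2 * math.pi) + math.log(si)
        total += 0.5 * ((xi - mi) / si) ** 2
    return total


def structured_nll(x, mu, L):
    """Negative log-likelihood under a Gaussian with precision L @ L.T.

    L is a lower-triangular matrix (list of rows) with a positive diagonal.
    Assumption for this sketch: the network predicts L directly.
    """
    d = len(x)
    r = [xi - mi for xi, mi in zip(x, mu)]  # residual x - mu
    # y = L^T r; only entries with i >= j are nonzero in column j of L.
    y = [sum(L[i][j] * r[i] for i in range(j, d)) for j in range(d)]
    quad = sum(v * v for v in y)  # r^T (L L^T) r = ||L^T r||^2
    # log det(L L^T) = 2 * sum(log(diag(L)))
    log_det_precision = 2.0 * sum(math.log(L[i][i]) for i in range(d))
    return 0.5 * (d * math.log(2 * math.pi) - log_det_precision + quad)


# Usage: with a diagonal L (entries 1/sigma), the structured model
# reduces to the factorized one.
x = [0.5, -1.0, 2.0]
mu = [0.0, 0.0, 1.0]
sigma = [1.0, 2.0, 0.5]
L_diag = [[1.0 / sigma[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
nll_factorized = factorized_nll(x, mu, sigma)
nll_structured = structured_nll(x, mu, L_diag)
```

Off-diagonal entries in `L` are what let the model express correlations between neighboring pixels' residuals. Working with a triangular factor keeps both the quadratic form and the log-determinant cheap to evaluate, which is one plausible reason to parameterize the likelihood this way.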

Code Implementations: 2 repos
