Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse
This paper provides a theoretical explanation for posterior collapse in VAEs, a key issue for researchers in generative modeling, though the contribution is incremental because it builds on existing linear analysis.
The paper tackles the problem of posterior collapse in Variational Autoencoders (VAEs) by analyzing linear VAEs. It shows that the ELBO objective does not introduce additional spurious local maxima compared to the log marginal likelihood, and that training with exact variational inference recovers an identifiable global maximum corresponding to the principal component directions.
Posterior collapse in Variational Autoencoders (VAEs) arises when the variational posterior distribution closely matches the prior for a subset of latent variables. This paper presents a simple and intuitive explanation for posterior collapse through the analysis of linear VAEs and their direct correspondence with Probabilistic PCA (pPCA). We explain how posterior collapse may occur in pPCA due to local maxima in the log marginal likelihood. Unexpectedly, we prove that the ELBO objective for the linear VAE does not introduce additional spurious local maxima relative to log marginal likelihood. We show further that training a linear VAE with exact variational inference recovers an identifiable global maximum corresponding to the principal component directions. Empirically, we find that our linear analysis is predictive even for high-capacity, non-linear VAEs and helps explain the relationship between the observation noise, local maxima, and posterior collapse in deep Gaussian VAEs.
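The link between observation noise and collapse in pPCA can be illustrated with a minimal NumPy sketch. It assumes the standard closed-form pPCA maximum-likelihood weights with the noise variance held fixed and the arbitrary rotation set to the identity, so that each column of W is a principal direction scaled by sqrt(max(lambda_i - sigma2, 0)). The function names `ppca_mle_weights` and `posterior_covariance` are hypothetical helpers for illustration, not code from the paper.

```python
import numpy as np


def ppca_mle_weights(X, latent_dim, sigma2):
    """Closed-form pPCA weights for a FIXED noise variance sigma2.

    Assumption: rotation R = I, so column i of W is the i-th principal
    direction scaled by sqrt(max(lambda_i - sigma2, 0)).
    """
    Xc = X - X.mean(axis=0)                    # center the data
    cov = Xc.T @ Xc / X.shape[0]               # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:latent_dim]
    lam = eigvals[order]                       # top eigenvalues, descending
    U = eigvecs[:, order]                      # matching principal directions
    # Directions whose eigenvalue is below sigma2 get a zero column.
    scale = np.sqrt(np.clip(lam - sigma2, 0.0, None))
    return U * scale, lam


def posterior_covariance(W, sigma2):
    """Exact pPCA posterior covariance: sigma2 * (W^T W + sigma2 * I)^{-1}."""
    q = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(q)
    return sigma2 * np.linalg.inv(M)
```

In this sketch, a latent dimension whose eigenvalue falls below `sigma2` receives a zero column in `W`. Its posterior mean is then zero and its posterior variance is one, so the posterior for that dimension equals the standard normal prior, which is the collapse described in the abstract. Lowering `sigma2` below that eigenvalue reactivates the dimension.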