On Posterior Collapse and Encoder Feature Dispersion in Sequence VAEs
The work addresses posterior collapse, a key limitation of VAEs for text modeling, and offers a computationally efficient fix for researchers and practitioners, though the contribution is incremental because it builds on a known issue.
The paper tackles posterior collapse in sequence VAEs with autoregressive decoders, where the model learns to ignore the latent variables. It identifies a lack of dispersion in encoder features as one cause and proposes a pooling-based fix that yields significantly better data log-likelihood than standard sequence VAEs.
Variational autoencoders (VAEs) hold great potential for modelling text, as they could in theory separate high-level semantic and syntactic properties from local regularities of natural language. Practically, however, VAEs with autoregressive decoders often suffer from posterior collapse, a phenomenon where the model learns to ignore the latent variables, causing the sequence VAE to degenerate into a language model. In this paper, we argue that posterior collapse is in part caused by the lack of dispersion in encoder features. We provide empirical evidence to verify this hypothesis, and propose a straightforward fix using pooling. This simple technique effectively prevents posterior collapse, allowing the model to achieve significantly better data log-likelihood than standard sequence VAEs. Compared to existing work, our proposed method is able to achieve comparable or superior performance while being more computationally efficient.
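To make the idea of pooling encoder features concrete, the following is a minimal NumPy sketch, not the paper's implementation. The abstract only states that pooling is used. The baseline that takes the last hidden state, the choice of mean and max pooling over time, the total-variance measure of dispersion, and all function names here are illustrative assumptions. The sketch also omits padding masks, which real variable-length batches would need. The KL helper uses the standard closed form for a diagonal Gaussian against a standard normal prior. It evaluates to zero when the posterior matches the prior for every input, which is the collapsed case.

```python
import numpy as np


def last_state(hidden):
    """Assumed baseline: use the final time step as the sentence feature.

    hidden has shape (batch, time, dim); the result has shape (batch, dim).
    """
    return hidden[:, -1, :]


def mean_pool(hidden):
    """Average the hidden states over the time axis."""
    return hidden.mean(axis=1)


def max_pool(hidden):
    """Take the element-wise maximum of the hidden states over the time axis."""
    return hidden.max(axis=1)


def dispersion(features):
    """Hypothetical dispersion measure: per-dimension variance across the
    batch, summed over dimensions. It is zero when all inputs map to the
    same feature vector."""
    return float(features.var(axis=0).sum())


def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)) for each example."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)


# Toy stand-in for encoder outputs: random values, not real RNN states.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(8, 20, 16))  # (batch, time, dim)

for name, fn in [("last", last_state), ("mean", mean_pool), ("max", max_pool)]:
    feats = fn(hidden)
    print(name, feats.shape, dispersion(feats))

# A posterior equal to the prior for every input gives zero KL (collapse).
print(gaussian_kl(np.zeros((8, 4)), np.zeros((8, 4))))
```

With real encoder states in place of the random array, the same `dispersion` call could be used to compare how spread out the features are under each aggregation scheme, which is the kind of quantity the hypothesis in the abstract concerns.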