LG, Sep 26, 2022

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

arXiv:2209.12590v25 citations, h-index: 76
Originality Incremental advance
AI Analysis

The paper addresses a key bottleneck in training sequence VAEs for controlled generation and representation learning, offering an incremental improvement over existing dropout methods.

The paper tackles the problem of posterior collapse in sequence variational autoencoders (VAEs), where decoders ignore the latent space, by proposing an adversarial training strategy for targeted dropout. This approach improves both sequence modeling performance and latent space information capture compared to uniform dropout on text benchmarks.

In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, a failure mode known as posterior collapse. To mitigate this, state-of-the-art models weaken the powerful decoder by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.
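
As a rough illustration of the uniform baseline the abstract refers to, the sketch below masks each decoder-input token independently with a fixed probability. It is a minimal sketch under stated assumptions, not the paper's code: the function name `uniform_word_dropout`, the `<unk>` placeholder, and the `rate` parameter are hypothetical, and the adversarial, information-based dropout the paper proposes is not reproduced here.

```python
import random


def uniform_word_dropout(tokens, rate, unk_token="<unk>", rng=None):
    """Replace each decoder-input token with `unk_token`, independently,
    with probability `rate` (hypothetical helper for illustration only)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = rng or random.Random()
    # rng.random() is in [0, 1), so rate=0 keeps every token
    # and rate=1 masks every token.
    return [unk_token if rng.random() < rate else tok for tok in tokens]


if __name__ == "__main__":
    sentence = ["the", "cat", "sat", "on", "the", "mat"]
    # Seeded generator so repeated runs mask the same positions.
    print(uniform_word_dropout(sentence, rate=0.4, rng=random.Random(0)))
```

With every position equally likely to be masked, the decoder loses context regardless of how informative each token is; the targeted approach described in the abstract instead chooses what to drop based on information content.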

