LG Sep 26, 2022

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

Đorđe Miladinović, Kumar Shridhar, Kushal Jain, Max B. Paulus, Joachim M. Buhmann, Mrinmaya Sachan, Carl Allen

arXiv:2209.12590v24.65 citations h-index: 76

Originality: Incremental advance

AI Analysis

This paper addresses a key bottleneck in training sequence VAEs for controlled generation and representation learning: posterior collapse. It offers an incremental improvement over existing dropout methods.

The paper tackles the problem of posterior collapse in sequence variational autoencoders (VAEs), where decoders ignore the latent space, by proposing an adversarial training strategy for targeted dropout. This approach improves both sequence modeling performance and latent space information capture compared to uniform dropout on text benchmarks.

In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, a phenomenon known as posterior collapse. To mitigate this, state-of-the-art models weaken the powerful decoder by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.
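
The uniform dropout baseline mentioned in the abstract can be illustrated with a minimal sketch. It shows only the baseline that the paper compares against, not the proposed adversarial strategy. The function name `uniform_word_dropout`, the `<unk>` placeholder, and the protected `<bos>` token are assumptions made for illustration and do not come from the paper.

```python
import random


def uniform_word_dropout(tokens, rate, unk="<unk>", keep=("<bos>",), rng=None):
    """Replace each decoder-input token with `unk` independently with probability `rate`.

    Hypothetical helper: tokens listed in `keep` (assumed here to be the
    start-of-sequence marker) are never dropped.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    if rng is None:
        rng = random.Random()
    # rng.random() lies in [0, 1), so rate=0 keeps everything and rate=1 drops everything.
    return [
        tok if tok in keep or rng.random() >= rate else unk
        for tok in tokens
    ]


if __name__ == "__main__":
    decoder_input = ["<bos>", "the", "cat", "sat", "down"]
    print(uniform_word_dropout(decoder_input, rate=0.4, rng=random.Random(0)))
```

Every position is dropped with the same probability regardless of how informative it is. The abstract's information-based dropout instead targets which inputs to remove, which this sketch does not attempt to reproduce.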
