LG · CL · Jun 16, 2021

Discrete Auto-regressive Variational Attention Models for Text Modeling

arXiv:2106.08571v1 · 3 citations
Originality: Incremental advance
AI Analysis

The paper addresses challenges that researchers face when using VAEs for text modeling, but the advance is incremental because it builds on existing VAE frameworks.

The paper tackles two problems in variational autoencoders for text modeling, information underrepresentation and posterior collapse, by proposing the Discrete Auto-regressive Variational Attention Model (DAVAM). The model enriches the latent space, is shown mathematically to avoid posterior collapse, and outperforms several VAE counterparts on language modeling tasks.

Variational autoencoders (VAEs) have been widely applied to text modeling. In practice, however, they are troubled by two challenges: information underrepresentation and posterior collapse. The former arises because only the last hidden state of the LSTM encoder is transformed into the latent space, which is generally insufficient to summarize the data. The latter is a long-standing problem in VAE training, in which the optimization becomes trapped in a disastrous local optimum. In this paper, we propose the Discrete Auto-regressive Variational Attention Model (DAVAM) to address these challenges. Specifically, we introduce an auto-regressive variational attention approach that enriches the latent space by effectively capturing the semantic dependency from the input. We further design a discrete latent space for the variational attention and mathematically show that our model is free from posterior collapse. Extensive experiments on language modeling tasks demonstrate the superiority of DAVAM over several VAE counterparts.
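
To make "posterior collapse" concrete: in a Gaussian VAE, the KL term of the training objective falls to zero when the approximate posterior matches the prior, at which point the latent variable carries no information about the input. The sketch below is illustrative only and is not taken from the paper. It assumes a one-dimensional Gaussian latent for the continuous case and, for the discrete case, a deterministic one-hot posterior with a uniform prior over a hypothetical K = 8 codes (the setup familiar from vector-quantised VAEs). Under those assumptions the KL term is the constant log K, so the optimizer has nothing to shrink. Whether DAVAM's argument takes exactly this form is not stated in the abstract.

```python
import math


def gaussian_kl(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ) for a one-dimensional latent."""
    return 0.5 * (mu**2 + sigma**2 - 1.0) - math.log(sigma)


def categorical_kl(q, p):
    """KL(q || p) for two discrete distributions given as probability lists."""
    # Terms with q_i == 0 contribute nothing, by the convention 0 * log 0 = 0.
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)


# Continuous latent: the KL term vanishes once the posterior equals the prior.
collapsed_kl = gaussian_kl(0.0, 1.0)     # collapsed posterior
informative_kl = gaussian_kl(1.5, 0.5)   # posterior that still encodes the input

# Discrete latent (assumed setup): one-hot posterior, uniform prior over K codes.
K = 8  # hypothetical codebook size, chosen only for illustration
uniform_prior = [1.0 / K] * K
one_hot_posterior = [1.0 if k == 3 else 0.0 for k in range(K)]
discrete_kl = categorical_kl(one_hot_posterior, uniform_prior)

print(collapsed_kl, informative_kl, discrete_kl, math.log(K))
```

In this toy setting the discrete KL is the same whichever code the one-hot posterior selects, which is why it cannot be driven to zero during training.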

Code Implementations: 1 repo
