CL · Mar 28, 2020

Variational Transformers for Diverse Response Generation

arXiv:2003.12738v1 · 52 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for more diverse response generation and more efficient training in dialogue systems. It is an incremental advance that hybridizes two existing methods: the Transformer and the conditional variational autoencoder (CVAE).

The paper tackles two problems in dialogue response generation, a high-entropy task: Transformers are deterministic, and the RNN-based CVAEs previously used to capture response variability are slow to train because of their autoregressive computation. It proposes the Variational Transformer (VT), which incorporates stochastic latent variables into the Transformer in the manner of a conditional variational autoencoder. On three conversational datasets, the authors report improvements over standard Transformers and other baselines in diversity, semantic relevance, and human judgment.

Despite the great promise of Transformers in many sequence modeling tasks (e.g., machine translation), their deterministic nature hinders them from generalizing to high-entropy tasks such as dialogue response generation. Previous work proposes to capture the variability of dialogue responses with a recurrent neural network (RNN)-based conditional variational autoencoder (CVAE). However, the autoregressive computation of the RNN limits the training efficiency. Therefore, we propose the Variational Transformer (VT), a variational self-attentive feed-forward sequence model. The VT combines the parallelizability and global receptive field of the Transformer with the variational nature of the CVAE by incorporating stochastic latent variables into Transformers. We explore two types of the VT: 1) modeling the discourse-level diversity with a global latent variable; and 2) augmenting the Transformer decoder with a sequence of fine-grained latent variables. The proposed models are then evaluated on three conversational datasets with both automatic metrics and human evaluation. The experimental results show that our models improve on standard Transformers and other baselines in terms of diversity, semantic relevance, and human judgment.
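
To make the "stochastic latent variable" idea concrete, here is a minimal sketch of the two ingredients any Gaussian CVAE relies on: a reparameterized sample of the latent variable and the KL term between a posterior and a prior. It is a generic illustration, not the paper's implementation. The function names (`sample_latent`, `gaussian_kl`), the diagonal-Gaussian assumption, and the toy numbers are all assumptions introduced here.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterized sample: z = mu + sigma * eps, with eps ~ N(0, I).

    mu and log_var are lists of per-dimension means and log-variances
    (hypothetical outputs of a prior or posterior network).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def gaussian_kl(mu_q, log_var_q, mu_p, log_var_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, log_var_q, mu_p, log_var_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


# Toy values standing in for network outputs (assumed, for illustration only).
rng = random.Random(0)
posterior_mu, posterior_log_var = [1.0, -0.5], [0.0, 0.0]
prior_mu, prior_log_var = [0.0, 0.0], [0.0, 0.0]

z = sample_latent(posterior_mu, posterior_log_var, rng)
kl = gaussian_kl(posterior_mu, posterior_log_var, prior_mu, prior_log_var)
```

In a typical CVAE setup, `z` is drawn from the posterior during training and from the prior at inference time, and the KL term is added to the reconstruction loss. Sampling a different `z` for the same context is what lets the decoder produce different responses.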

Code Implementations: 2 repos
