CLDec 11, 2018

Conditional Variational Autoencoder for Neural Machine Translation

arXiv:1812.04405v161 citations
Originality Incremental advance
AI Analysis

This work addresses the problem of enhancing translation quality in neural machine translation for language processing applications, representing an incremental advancement in applying variational methods to text.

The paper tackles the challenge of using latent variable models for neural machine translation by introducing a conditional variational autoencoder with a co-attention mechanism to mitigate posterior collapse, resulting in improved performance over existing attention-based and variational baselines.

We explore the performance of latent variable models for conditional text generation in the context of neural machine translation (NMT). Similar to Zhang et al., we augment the encoder-decoder NMT paradigm by introducing a continuous latent variable to model features of the translation process. We extend this model with a co-attention mechanism motivated by Parikh et al. in the inference network. Compared to the vision domain, latent variable models for text face additional challenges due to the discrete nature of language, namely posterior collapse. We experiment with different approaches to mitigate this issue. We show that our conditional variational model improves upon both discriminative attention-based translation and the variational baseline presented in Zhang et al. Finally, we present some exploration of the learned latent space to illustrate what the latent variable is capable of capturing. This is the first reported conditional variational model for text that meaningfully utilizes the latent variable without weakening the translation model.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes