CL · May 28, 2020

Variational Neural Machine Translation with Normalizing Flows

Hendra Setiawan, Matthias Sperber, Udhay Nallasamy, Matthias Paulik

arXiv:2005.13978v1

Originality: Incremental advance

AI Analysis

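This work targets a known difficulty in neural machine translation: learning latent variables that the translation model actually uses rather than ignores. It does so by making the approximate posterior more flexible. The advance is incremental, since it builds on the existing VNMT framework and the Transformer architecture.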

The paper tackles the challenge of learning informative latent variables in Variational Neural Machine Translation (VNMT) by applying the framework to the Transformer and introducing a more flexible approximate posterior based on normalizing flows. The authors report that the resulting models significantly outperform strong baselines under both in-domain and out-of-domain conditions.

Variational Neural Machine Translation (VNMT) is an attractive framework for modeling the generation of target translations, conditioned not only on the source sentence but also on some latent random variables. The latent variable modeling may introduce useful statistical dependencies that can improve translation accuracy. Unfortunately, learning informative latent variables is non-trivial, as the latent space can be prohibitively large, and the latent codes are prone to be ignored by many translation models at training time. Previous works impose strong assumptions on the distribution of the latent code and limit the choice of the NMT architecture. In this paper, we propose to apply the VNMT framework to the state-of-the-art Transformer and introduce a more flexible approximate posterior based on normalizing flows. We demonstrate the efficacy of our proposal under both in-domain and out-of-domain conditions, significantly outperforming strong baselines.
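To make the idea of a flow-based approximate posterior concrete, here is a minimal sketch using a planar flow, one of the simplest normalizing-flow families. The abstract does not say which flow the authors use, so this is an illustration of the general technique rather than the paper's method. The function names `planar_flow` and `sample_flow_posterior` and all parameter values are hypothetical. A sample from a diagonal Gaussian is pushed through invertible transformations, and its log-density is corrected by each transformation's log-determinant.

```python
import math
import random

def planar_flow(z, u, w, b):
    """One planar flow step f(z) = z + u * tanh(w.z + b).

    Returns (f(z), log|det df/dz|); z, u, w are equal-length lists.
    """
    a = math.tanh(sum(wi * zi for wi, zi in zip(w, z)) + b)
    # det(I + u psi^T) = 1 + u.psi, with psi = (1 - a^2) * w
    u_dot_psi = (1.0 - a * a) * sum(ui * wi for ui, wi in zip(u, w))
    log_det = math.log(abs(1.0 + u_dot_psi))
    return [zi + ui * a for zi, ui in zip(z, u)], log_det

def sample_flow_posterior(mu, log_sigma, flows, rng):
    """Draw z_K from a flow-based posterior; return (z_K, log q_K(z_K))."""
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    # z_0 ~ N(mu, diag(sigma^2)) via the reparameterization trick
    z = [m + math.exp(s) * e for m, s, e in zip(mu, log_sigma, eps)]
    # log-density of z_0 under the diagonal Gaussian base distribution
    log_q = -0.5 * sum(e * e + math.log(2 * math.pi) + 2 * s
                       for e, s in zip(eps, log_sigma))
    for u, w, b in flows:
        z, log_det = planar_flow(z, u, w, b)
        log_q -= log_det  # change-of-variables correction
    return z, log_q

# Illustrative usage with made-up parameters (assumed, not from the paper).
rng = random.Random(0)
flows = [([0.3, -0.2], [0.4, 0.1], 0.1)]
z_k, log_q = sample_flow_posterior([0.0, 0.0], [0.0, 0.0], flows, rng)
```

In a VNMT setting, `mu`, `log_sigma`, and the flow parameters would presumably be produced by an inference network conditioned on the source and target sentences, and `log_q` would enter the KL term of the variational objective. With an empty `flows` list the sketch reduces to the plain Gaussian posterior.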
