A Stochastic Decoder for Neural Machine Translation
This work addresses variation in parallel corpora for machine translation. The contribution is incremental, as it builds on existing generative models.
The authors tackle translation ambiguity in neural machine translation by introducing a deep generative model with a chain of latent variables that accounts for local lexical and syntactic variation, yielding consistent improvements over strong baselines across several language pairs.
The process of translation is ambiguous, in that there are typically many valid translations for a given sentence. This gives rise to significant variation in parallel corpora; however, most current models of machine translation do not account for this variation, instead treating the problem as a deterministic process. To this end, we present a deep generative model of machine translation which incorporates a chain of latent variables, in order to account for local lexical and syntactic variation in parallel corpora. We provide an in-depth analysis of the pitfalls encountered in variational inference for training deep generative models. Experiments on several different language pairs demonstrate that the model consistently improves over strong baselines.