Generative latent neural models for automatic word alignment
This work addresses word alignment, which supports machine translation and bilingual dictionary learning, but the contribution is incremental because it builds on existing variational autoencoder methods.
The paper tackles word alignment in parallel sentence pairs by proposing several evolutions of variational autoencoders, which achieve results competitive with Giza++ and with a strong neural network alignment system on two language pairs.
Word alignments identify translational correspondences between words in a parallel sentence pair and are used, for instance, to learn bilingual dictionaries, to train statistical machine translation systems, or to perform quality estimation. Variational autoencoders have recently been used in various areas of natural language processing to learn, in an unsupervised way, latent representations that are useful for language generation tasks. In this paper, we study these models for the task of word alignment, and we propose and assess several evolutions of a vanilla variational autoencoder. We demonstrate that these techniques can yield competitive results compared to Giza++ and to a strong neural network alignment system for two language pairs.