CL · Apr 7, 2020

Machine Translation with Unsupervised Length-Constraints

arXiv:2004.03176v1 31.11004 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for length-constrained translations in applications like formatting, but it is incremental as it builds on existing encoder-decoder architectures.

The paper tackles the problem of generating machine translations that meet specific length constraints, which is essential when the output must fit a given display format. It proposes an end-to-end approach that learns text compression without supervision and integrates the constraints into the model. The result is significantly improved translation quality under constraints, and the same approach enables unsupervised monolingual sentence compression.

Deep learning has brought significant improvements in machine translation. While the gains in translation quality are impressive, the encoder-decoder architecture enables many more possibilities. In this paper, we explore one of these: the generation of constrained translations. We focus on length constraints, which are essential if the translation must be displayed in a given format. We propose an end-to-end approach for this task. In contrast to a traditional method that first translates and then performs sentence compression, our approach learns text compression completely unsupervised. By combining the idea with zero-shot multilingual machine translation, we are also able to perform unsupervised monolingual sentence compression. To fulfill the length constraints, we investigated several methods of integrating the constraints into the model. Using the presented technique, we are able to significantly improve the translation quality under constraints.
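The abstract does not say which methods were used to integrate the length constraint into the model. As a purely illustrative sketch, and not the paper's implementation, the Python below shows two generic ways a length budget could be exposed to an encoder-decoder model: a pseudo-token prepended to the source, and a per-step countdown of remaining tokens for the decoder. The function names, the `<len_N>` token format, and the ratio-based budget are all assumptions made for this example.

```python
from typing import List, Sequence


def length_budget(source_tokens: Sequence[str], ratio: float) -> int:
    """Hypothetical budget: a fraction of the source length, at least 1 token."""
    return max(1, int(len(source_tokens) * ratio))


def add_length_token(source_tokens: Sequence[str], ratio: float) -> List[str]:
    """Prepend an assumed pseudo-token such as <len_5> that encodes the budget."""
    budget = length_budget(source_tokens, ratio)
    return [f"<len_{budget}>"] + list(source_tokens)


def remaining_length(budget: int, steps: int) -> List[int]:
    """Tokens still allowed at each decoding step, clipped at zero."""
    return [max(0, budget - t) for t in range(steps)]


if __name__ == "__main__":
    src = "the quick brown fox jumps over the lazy sleeping dog".split()
    print(add_length_token(src, 0.5))
    print(remaining_length(length_budget(src, 0.5), 7))
```

In a setup like this, the model would learn the relationship between the budget signal and output length from ordinary parallel data, since the length of each training target is known; no compressed reference translations would be needed.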

