CL
May 5, 2020

It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information

arXiv:2005.02354v21005 citations
Has Code
Originality: Incremental advance
AI Analysis

XMI gives researchers and practitioners a tool for systematically assessing how difficult each translation direction is. The advance is incremental because it builds on existing information-theoretic concepts.

The authors address the problem of evaluating translation difficulty in neural machine translation by proposing cross-mutual information (XMI), an asymmetric metric that measures translation difficulty while controlling for the difficulty of target-side generation. They find that translating into English is harder than translating out of it.

The performance of neural machine translation systems is commonly evaluated in terms of BLEU. However, due to its reliance on target language properties and generation, the BLEU metric does not allow an assessment of which translation directions are more difficult to model. In this paper, we propose cross-mutual information (XMI): an asymmetric information-theoretic metric of machine translation difficulty that exploits the probabilistic nature of most neural machine translation models. XMI allows us to better evaluate the difficulty of translating text into the target language while controlling for the difficulty of the target-side generation component independent of the translation task. We then present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems. Code for replicating our experiments is available online at https://github.com/e-bug/nmt-difficulty.
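The abstract does not spell out how XMI is computed, so the following is only a minimal sketch of the idea. It assumes XMI is estimated as the held-out cross-entropy of the target text under a target-only language model minus its cross-entropy under the translation model conditioned on the source. The function names and the input format (natural-log token probabilities, one list per sentence) are hypothetical and are not taken from the authors' repository.

```python
import math


def cross_entropy_bits(sentence_logprobs):
    """Average cross-entropy in bits per sentence.

    sentence_logprobs: list of sentences, each a list of natural-log
    probabilities that a model assigned to the reference target tokens.
    (Hypothetical input format, chosen for illustration.)
    """
    if not sentence_logprobs:
        raise ValueError("need at least one sentence")
    total_nats = -sum(sum(sentence) for sentence in sentence_logprobs)
    total_bits = total_nats / math.log(2)
    return total_bits / len(sentence_logprobs)


def xmi_estimate(lm_logprobs, mt_logprobs):
    """Assumed XMI estimate: H_LM(T) - H_MT(T | S), in bits per sentence.

    lm_logprobs: target-token log-probs from a target-only language model.
    mt_logprobs: log-probs for the same tokens from the translation model,
                 which also sees the source sentence.
    """
    if len(lm_logprobs) != len(mt_logprobs):
        raise ValueError("both models must score the same sentences")
    return cross_entropy_bits(lm_logprobs) - cross_entropy_bits(mt_logprobs)


# Toy example with one two-token sentence: the language model gives each
# token probability 0.25, the translation model gives each 0.5.
lm_scores = [[math.log(0.25), math.log(0.25)]]
mt_scores = [[math.log(0.5), math.log(0.5)]]
toy_xmi = xmi_estimate(lm_scores, mt_scores)
```

Under this assumed definition, a larger value means the source sentence removes more uncertainty about the target, and the language-model term is what controls for how hard the target language is to generate on its own.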

Code Implementations: 1 repo