CL, Aug 30, 2018

Pronoun Translation in English-French Machine Translation: An Analysis of Error Types

arXiv:1808.10196v1, 9 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a long-standing challenge in machine translation that matters to both linguists and developers, but its contribution is incremental: it focuses on error analysis and introduces no new methods.

The study analyzes pronoun translation errors in English-French machine translation through a manual evaluation based on the PROTEST test suite. Rule-based systems perform poorly because they oversimplify, while SMT and early NMT systems fall short because they lack awareness of the functional and referential properties of pronouns. A recent Transformer-based NMT system with cross-sentence context shows promising results on non-anaphoric pronouns and intra-sentential anaphora, but still struggles with cross-sentence dependencies.

Pronouns are a long-standing challenge in machine translation. We present a study of the performance of a range of rule-based, statistical and neural MT systems on pronoun translation based on an extensive manual evaluation using the PROTEST test suite, which enables a fine-grained analysis of different pronoun types and sheds light on the difficulties of the task. We find that the rule-based approaches in our corpus perform poorly as a result of oversimplification, whereas SMT and early NMT systems exhibit significant shortcomings due to a lack of awareness of the functional and referential properties of pronouns. A recent Transformer-based NMT system with cross-sentence context shows very promising results on non-anaphoric pronouns and intra-sentential anaphora, but there is still considerable room for improvement in examples with cross-sentence dependencies.

Code Implementations: 1 repo