CL · Jun 22, 2021

On the Evaluation of Machine Translation for Terminology Consistency

Md Mahfuz ibn Alam, Antonios Anastasopoulos, Laurent Besacier, James Cross, Matthias Gallé, Philipp Koehn, Vassilina Nikoulina

arXiv:2106.11891v22.637 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for reliable evaluation tools in professional translation pipelines, particularly for domain adaptation. Its contribution is incremental, since it builds on existing efforts on terminology integration.

The authors address the problem of evaluating whether machine translation systems adhere to domain-specific terminologies. They propose new metrics and validate them through studies on the COVID-19 domain across 5 languages, including human evaluation.

As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies. In many scenarios, and particularly in cases of domain adaptation, one expects the MT output to adhere to the constraints provided by a terminology. In this work, we propose metrics to measure the consistency of MT output with regard to a domain terminology. We perform studies on the COVID-19 domain over 5 languages, also performing terminology-targeted human evaluation. We open-source the code for computing all proposed metrics: https://github.com/mahfuzibnalam/terminology_evaluation
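
To make the idea of terminology consistency concrete, here is a minimal sketch of one simple way such a score could be computed: the fraction of required target-side terms that appear verbatim in the MT output. This is an illustrative assumption, not the authors' implementation or the API of their repository. The function name `term_match_rate`, the input layout, and the example sentences are all hypothetical.

```python
def term_match_rate(hypotheses: list[str], required_terms: list[list[str]]) -> float:
    """Fraction of required target terms found verbatim in the MT output.

    hypotheses[i] is the MT output for segment i, and required_terms[i] lists
    the target-language terms the terminology prescribes for that segment.
    Matching is a case-insensitive substring test (an assumed simplification:
    it ignores inflection and word boundaries).
    """
    matched = 0
    total = 0
    for hypothesis, terms in zip(hypotheses, required_terms):
        hypothesis_lower = hypothesis.lower()
        for term in terms:
            total += 1
            if term.lower() in hypothesis_lower:
                matched += 1
    # With no required terms there is nothing to violate; report 0.0 by convention.
    return matched / total if total else 0.0


# Hypothetical example data, not taken from the paper.
hyps = ["The patient was placed in quarantine.", "Wear a mask indoors."]
terms = [["quarantine"], ["face covering", "indoors"]]
rate = term_match_rate(hyps, terms)  # 2 of the 3 required terms are present
```

A substring test like this is deliberately crude. It would miss inflected forms of a term and says nothing about whether the term is placed in the right context, which is one reason a terminology-targeted human evaluation is a useful complement.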

