Learning to Detect Unacceptable Machine Translations for Downstream Tasks
This work addresses the problem of ensuring translation quality for downstream applications, but its contribution is incremental: it builds on existing machine translation and annotation methods.
The authors tackle the problem of machine translation systems producing translations that are unacceptable for specific downstream tasks. They introduce a framework that uses parallel data to automatically generate acceptability annotations and train task-specific detectors, and they report improved detection performance across various tasks and translation models.
The field of machine translation has progressed tremendously in recent years. Although translation quality has improved significantly, current systems are still unable to produce uniformly acceptable translations for the variety of possible use cases. In this work, we put machine translation in a cross-lingual pipeline and introduce downstream tasks to define task-specific acceptability of machine translations. This allows us to leverage parallel data to automatically generate acceptability annotations on a large scale, which in turn help to learn acceptability detectors for the downstream tasks. We conduct experiments to demonstrate the effectiveness of our framework for a range of downstream tasks and translation models.
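The annotation step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a translation counts as acceptable when the downstream model gives the same prediction on the machine translation as on the reference translation from the parallel data. The function `label_acceptability` and the toy stand-ins `toy_translate` and `toy_sentiment` are hypothetical names introduced only for this example.

```python
from typing import Callable, Hashable, List, Tuple

# One automatically annotated example: (source, machine translation, acceptable?)
Annotated = Tuple[str, str, bool]


def label_acceptability(
    parallel: List[Tuple[str, str]],
    translate: Callable[[str], str],
    downstream: Callable[[str], Hashable],
) -> List[Annotated]:
    """Annotate machine translations using parallel data and a downstream task.

    Assumption (not confirmed by the source): a translation is acceptable
    when the downstream prediction on it matches the prediction on the
    reference translation.
    """
    annotated = []
    for source, reference in parallel:
        hypothesis = translate(source)
        acceptable = downstream(hypothesis) == downstream(reference)
        annotated.append((source, hypothesis, acceptable))
    return annotated


# Hypothetical stand-ins for a real translation system and downstream model.
def toy_translate(source: str) -> str:
    # The second entry drops the negation, imitating a translation error.
    table = {
        "das ist gut": "that is good",
        "das ist nicht gut": "that is good",
    }
    return table[source]


def toy_sentiment(text: str) -> str:
    # Keyword-based sentiment classifier, for illustration only.
    return "negative" if "not" in text.split() else "positive"


parallel_data = [
    ("das ist gut", "this is good"),
    ("das ist nicht gut", "this is not good"),
]

annotations = label_acceptability(parallel_data, toy_translate, toy_sentiment)
for source, hypothesis, acceptable in annotations:
    print(source, "->", hypothesis, "| acceptable:", acceptable)
```

In this sketch the first translation differs from the reference in wording but leaves the downstream prediction unchanged, so it is labeled acceptable, while the second flips the prediction and is labeled unacceptable. The resulting (source, translation, label) triples are the kind of data on which a task-specific acceptability detector could then be trained.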