Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort
This work addresses the challenge of reducing human evaluation costs in machine translation, which matters to both researchers and practitioners. The contribution is incremental, however, because it applies existing online learning methods to a new application.
The paper tackles the problem of efficiently identifying the best machine translation systems among many. It uses online learning to converge dynamically to the top systems with minimal human feedback, and reports convergence to the top-3 ranked systems on WMT'19 datasets.
In Machine Translation, assessing the quality of a large number of automatic translations can be challenging. Automatic metrics are not reliable when it comes to high-performing systems. In addition, resorting to human evaluators can be expensive, especially when evaluating multiple systems. To overcome the latter challenge, we propose a novel application of online learning that, given an ensemble of Machine Translation systems, dynamically converges to the best systems by taking advantage of the available human feedback. Our experiments on WMT'19 datasets show that our online approach quickly converges to the top-3 ranked systems for the language pairs considered, despite the lack of human feedback for many translations.
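
The online selection loop described above can be illustrated with a small sketch. The text names only "online learning", so the choice of an EXP3-style bandit here is an assumption made for illustration and should not be read as the authors' exact method. The system names, the quality scores, the `simulated_feedback` function, and the rule that every second translation has no human judgment are all hypothetical stand-ins for real WMT'19 systems and human assessments.

```python
import math
import random


def exp3_rank_systems(systems, get_feedback, rounds, gamma=0.1, seed=0):
    """Rank MT systems with an EXP3-style bandit (illustrative assumption).

    systems: list of system names.
    get_feedback(system, t): human score in [0, 1] for the translation the
        system produced at step t, or None when no human judgment exists.
    Returns the system names sorted from highest to lowest learned weight.
    """
    rng = random.Random(seed)
    k = len(systems)
    weights = [1.0] * k

    for t in range(rounds):
        total = sum(weights)
        # Mix the weight-based distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / k for w in weights]
        chosen = rng.choices(range(k), weights=probs)[0]

        reward = get_feedback(systems[chosen], t)
        if reward is None:
            # No human judgment for this translation: leave weights as-is.
            continue

        # Importance-weighted reward estimate for the chosen system only.
        estimate = reward / probs[chosen]
        weights[chosen] *= math.exp(gamma * estimate / k)

        # Rescale so the largest weight is 1.0 and values cannot overflow.
        largest = max(weights)
        weights = [w / largest for w in weights]

    ranked = sorted(zip(systems, weights), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked]


# Hypothetical systems and quality scores, for demonstration only.
quality = {"sys_a": 0.9, "sys_b": 0.5, "sys_c": 0.2, "sys_d": 0.1}


def simulated_feedback(system, t):
    # Assumption: every second translation lacks a human judgment.
    if t % 2 == 0:
        return None
    return quality[system]


ranking = exp3_rank_systems(list(quality), simulated_feedback, rounds=10000)
print(ranking)
```

In this sketch, steps without a human score are simply skipped, which is one simple way to cope with sparse feedback. Other online learners, or other strategies for missing judgments, would fit the same loop.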