Revisiting Round-Trip Translation for Quality Estimation
This work addresses reference-free evaluation of translation quality for users of machine translation systems. It is incremental in that it builds on existing round-trip translation methods.
The paper improves quality estimation for machine translation by combining semantic embeddings with round-trip translation, achieving higher correlations with human judgments than previous submissions to the WMT 2019 task.
Quality estimation (QE) is the task of automatically evaluating the quality of translations without human-translated references. Calculating BLEU between the input sentence and its round-trip translation (RTT) was once considered as a metric for QE; however, it was found to be a poor predictor of translation quality. Recently, various pre-trained language models have made breakthroughs in NLP tasks by providing semantically meaningful word and sentence embeddings. In this paper, we apply semantic embeddings to RTT-based QE. Our method achieves the highest correlations with human judgments among previous WMT 2019 quality estimation metric task submissions. While the reliance on a backward translation model can be a drawback of RTT, we observe that with semantic-level metrics, RTT-based QE is robust to the choice of the backward translation system. Additionally, the proposed method shows consistent performance for both SMT and NMT forward translation systems, implying that the method does not penalize a particular type of model.
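The RTT-based QE pipeline described above can be illustrated with a minimal sketch: translate the source forward, translate the result back, and score the similarity between the source and the round-trip output. Everything named here is an assumption made for illustration. The `forward` and `backward` callables are hypothetical placeholders for real translation systems, and `embed` is a toy bag-of-words stand-in for the pre-trained semantic embeddings the abstract refers to, so this is not the paper's actual metric.

```python
import math
from collections import Counter
from typing import Callable


def embed(sentence: str) -> Counter:
    # ASSUMPTION: toy bag-of-words "embedding". A real system would use
    # sentence or word embeddings from a pre-trained language model.
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rtt_qe_score(
    source: str,
    forward: Callable[[str], str],   # hypothetical source -> target system
    backward: Callable[[str], str],  # hypothetical target -> source system
    similarity: Callable[[Counter, Counter], float] = cosine,
) -> float:
    # 1. Translate the source into the target language (system under test).
    hypothesis = forward(source)
    # 2. Translate the hypothesis back into the source language.
    round_trip = backward(hypothesis)
    # 3. Compare source and round-trip output. No reference is needed.
    return similarity(embed(source), embed(round_trip))


if __name__ == "__main__":
    # Identity functions stand in for real translators in this demo.
    print(rtt_qe_score("the cat sat on the mat", lambda s: s, lambda s: s))
```

Because the comparison happens entirely in the source language, the score depends on the backward system as well as the forward one. The abstract's robustness claim concerns exactly that dependence when the similarity function is semantic rather than surface-level like BLEU.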