CL · Sep 28, 2022

An Automatic Evaluation of the WMT22 General Machine Translation Task

arXiv:2209.14172v20.64 citations · h-index: 8

Originality: Synthesis-oriented

AI Analysis

This report provides a large-scale benchmark for machine translation researchers, but its contribution is incremental: it applies existing evaluation metrics to new data.

The report automatically evaluates 185 systems across 21 translation directions in the WMT22 general machine translation task. It reveals current limits of state-of-the-art systems and shows how the metrics chrF, BLEU, and COMET can complement each other, offsetting their individual limits in interpretability and accuracy.

This report presents an automatic evaluation of the general machine translation task of the Seventh Conference on Machine Translation (WMT22). It evaluates a total of 185 systems for 21 translation directions, ranging from high-resource to low-resource language pairs and from closely related to distant languages. This large-scale automatic evaluation highlights some of the current limits of state-of-the-art machine translation systems. It also shows how automatic metrics, namely chrF, BLEU, and COMET, can complement each other to mitigate their individual limits in terms of interpretability and accuracy.
