CL, Aug 11, 2025

Preliminary Ranking of WMT25 General Machine Translation Systems

ETH Zurich, Microsoft
arXiv:2508.14909v2, 3 citations, h-index: 45
Originality: Synthesis-oriented
AI Analysis

This paper provides incremental, temporary guidance for participants in a specific machine translation competition.

The paper presents preliminary rankings of machine translation systems from the WMT25 shared task using automatic metrics, noting potential biases toward re-ranking techniques, with the final rankings to be based on human evaluation.

We present the preliminary rankings of machine translation (MT) systems submitted to the WMT25 General Machine Translation Shared Task, as determined by automatic evaluation metrics. Because these rankings are derived from automatic evaluation, they may exhibit a bias toward systems that employ re-ranking techniques, such as Quality Estimation or Minimum Bayes Risk decoding. The official WMT25 ranking will be based on human evaluation, which is more reliable and will supersede these results. The purpose of releasing these findings now is to assist task participants with their system description papers, not to provide final findings.

