CL, Sep 3, 2025

Expanding the WMT24++ Benchmark with Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader

Jannis Vamvas, Ignacio Pérez Prat, Not Battesta Soliva, Sandra Baltermia-Guetg, Andrina Beeli, Simona Beeli, Madlaina Capeder, Laura Decurtins, Gian Peder Gregori, Flavia Hobi, Gabriela Holderegger, Arina Lazzarini

arXiv:2509.03148v29.64 citations, h-index: 4, Proceedings of the Tenth Conference on Machine Translation

Originality: Synthesis-oriented

AI Analysis

This work addresses the scarcity of evaluation data for low-resource languages such as Romansh, which benefits NLP researchers and the Romansh-speaking communities. The contribution is incremental, since it extends an existing benchmark.

The paper tackles the lack of machine translation evaluation resources for Romansh by creating a benchmark for six of its varieties, built on the WMT24++ benchmark. It finds that translation into Romansh remains challenging, while translation out of Romansh into German is handled relatively well.

The Romansh language, spoken in Switzerland, has limited resources for machine translation evaluation. In this paper, we present a benchmark for six varieties of Romansh: Rumantsch Grischun, a supra-regional variety, and five regional varieties: Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader. Our reference translations were created by human translators based on the WMT24++ benchmark, which ensures parallelism with more than 55 other languages. An automatic evaluation of existing MT systems and LLMs shows that translation out of Romansh into German is handled relatively well for all the varieties, but translation into Romansh is still challenging.
