CL, SI · Jun 13, 2025

A Gamified Evaluation and Recruitment Platform for Low Resource Language Machine Translation Systems

arXiv:2506.11467v1 · h-index: 1

Originality: Synthesis-oriented

AI Analysis

This work addresses a challenge faced by developers of machine translation systems for low-resource languages, though the contribution is incremental: it builds on existing evaluation procedures and offers a design proposal.

The paper tackles the problem of evaluating machine translation systems for low-resource languages by proposing a gamified platform that recruits human evaluators and generates datasets, addressing the shortage of both.

Human evaluators provide necessary contributions in evaluating large language models. In the context of Machine Translation (MT) systems for low-resource languages (LRLs), this is made even more apparent since popular automated metrics tend to be string-based, and therefore do not provide a full picture of the nuances of the behavior of the system. Human evaluators, when equipped with the necessary expertise in the language, will be able to test for adequacy, fluency, and other important metrics. However, the low-resource nature of the language means that both datasets and evaluators are in short supply. This presents the following conundrum: How can developers of MT systems for these LRLs find adequate human evaluators and datasets? This paper first presents a comprehensive review of existing evaluation procedures, with the objective of producing a design proposal for a platform that addresses the resource gap in terms of datasets and evaluators in developing MT systems. The result is a design for a recruitment and gamified evaluation platform for developers of MT systems. Challenges in evaluating this platform are also discussed, as well as its possible applications in the wider scope of Natural Language Processing (NLP) research.
