CL | AI | Sep 5, 2025

No Translation Needed: Forecasting Quality from Fertility and Metadata

arXiv:2509.05425v1 | h-index: 4
Originality: Synthesis-oriented
AI Analysis

The paper provides a method for multilingual evaluation and quality estimation, but the contribution is incremental: it applies existing techniques to a new task.

The paper tackles the problem of predicting translation quality without running the translation system. Using features such as token fertility and linguistic metadata, it reaches R² = 0.66 for XX→English and R² = 0.72 for English→XX on the FLORES-200 benchmark.

We show that translation quality can be predicted with surprising accuracy without ever running the translation system itself. Using only a handful of features, token fertility ratios, token counts, and basic linguistic metadata (language family, script, and region), we can forecast ChrF scores for GPT-4o translations across 203 languages in the FLORES-200 benchmark. Gradient boosting models achieve favorable performance (R² = 0.66 for XX→English and R² = 0.72 for English→XX). Feature importance analyses reveal that typological factors dominate predictions into English, while fertility plays a larger role for translations into diverse target languages. These findings suggest that translation quality is shaped by both token-level fertility and broader linguistic typology, offering new insights for multilingual evaluation and quality estimation.
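The pipeline the abstract describes can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: fertility is taken to be tokens per whitespace-delimited word (the abstract does not define it), script is reduced to a single hypothetical `is_latin_script` flag, the booster is a toy ensemble of one-split regression stumps rather than whatever gradient boosting library the paper used, and every number in `TOY` is synthetic and unrelated to the reported results.

```python
from statistics import mean


def fertility(num_tokens, num_words):
    """Tokens per word (assumed definition of token fertility)."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return num_tokens / num_words


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimising squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Splitting at every distinct value except the largest keeps both sides non-empty.
        for thr in values[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            l_val, r_val = mean(left), mean(right)
            sse = sum((r - l_val) ** 2 for r in left)
            sse += sum((r - r_val) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, l_val, r_val)
    if best is None:  # every feature is constant: fall back to a constant stump
        m = mean(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def fit_boosting(X, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = mean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        j, thr, l_val, r_val = fit_stump(X, residuals)
        stumps.append((j, thr, l_val, r_val))
        preds = [
            p + learning_rate * (l_val if row[j] <= thr else r_val)
            for row, p in zip(X, preds)
        ]
    return base, stumps, learning_rate


def predict(model, X):
    base, stumps, lr = model
    return [
        base + lr * sum(l if row[j] <= thr else r for j, thr, l, r in stumps)
        for row in X
    ]


# Synthetic rows, NOT from the paper: (num_tokens, num_words, is_latin_script, chrf)
TOY = [
    (24, 20, 1, 62.0),
    (26, 20, 1, 60.0),
    (30, 20, 1, 55.0),
    (45, 20, 0, 41.0),
    (60, 20, 0, 33.0),
    (80, 20, 0, 25.0),
]
X = [[fertility(t, w), float(latin)] for t, w, latin, _ in TOY]
y = [row[3] for row in TOY]

model = fit_boosting(X, y)
train_r2 = r_squared(y, predict(model, X))
print(f"training R^2 on toy data: {train_r2:.3f}")
```

The score printed here is a training-set fit on six made-up rows, so it says nothing about the paper's held-out R² values. It only shows how fertility and metadata features can feed a boosted regressor whose predictions are then scored with R², all without invoking a translation system.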
