CL | May 19, 2025

The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation

arXiv:2505.13090v2 | 1 citation | h-index: 17 | EMNLP
Originality: Incremental advance
AI Analysis

The paper reconciles conflicting prior findings on whether language diversity helps when fine-tuning for translation, and offers guidance for choosing fine-tuning strategies in multilingual NLP.

The study examines how language diversity during fine-tuning affects the translation quality of large language models. Expanding diversity improves quality for both unsupervised and supervised pairs, although the benefits plateau or decrease beyond a certain threshold. Greater diversity also yields more language-agnostic representations.

Prior research diverges on language diversity in LLM fine-tuning: Some studies report benefits while others find no advantages. Through controlled fine-tuning experiments across 132 translation directions, we systematically resolve these disparities. We find that expanding language diversity during fine-tuning improves translation quality for both unsupervised and -- surprisingly -- supervised pairs, despite less diverse models being fine-tuned exclusively on these supervised pairs. However, benefits plateau or decrease beyond a certain diversity threshold. We show that increased language diversity creates more language-agnostic representations. These representational adaptations help explain the improved performance in models fine-tuned with greater diversity.
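
To make the terminology concrete, the sketch below enumerates translation directions and splits them into supervised pairs (seen during fine-tuning) and unsupervised pairs (not seen). It is an illustration only: the abstract does not list the languages or the fine-tuning setups, so the language codes, the English-centric subset, and the helper names are all hypothetical. The one thing it does show is that 12 languages yield 12 × 11 = 132 ordered pairs, which is one way to arrive at the figure in the abstract.

```python
from itertools import permutations

# Hypothetical language codes; the abstract does not name the languages used.
LANGUAGES = ["en", "de", "nl", "sv", "ru", "cs", "pl", "fr", "es", "it", "zh", "ja"]


def all_directions(languages):
    """Return every ordered (source, target) pair of distinct languages."""
    return list(permutations(languages, 2))


def split_by_supervision(eval_directions, finetune_directions):
    """Split evaluation directions into those seen in fine-tuning and those not."""
    seen = set(finetune_directions)
    supervised = [d for d in eval_directions if d in seen]
    unsupervised = [d for d in eval_directions if d not in seen]
    return supervised, unsupervised


directions = all_directions(LANGUAGES)

# Hypothetical low-diversity setup: English-centric pairs over four languages.
subset = {"en", "de", "nl", "sv"}
finetune = [(s, t) for (s, t) in directions if "en" in (s, t) and {s, t} <= subset]

supervised, unsupervised = split_by_supervision(directions, finetune)
print(len(directions), len(supervised), len(unsupervised))
```

Widening `subset`, or dropping the English-centric filter, corresponds to raising language diversity in this toy setup: more directions move from the unsupervised group into the supervised one.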

