CL · Feb 18, 2025

CoCo-CoLa: Evaluating and Improving Language Adherence in Multilingual LLMs

arXiv:2502.12476v2 · 3 citations · h-index: 5 · Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)
Originality: Incremental advance
AI Analysis

This work addresses language bias in multilingual AI systems, which benefits users of low-resource languages, but the advance is incremental because it builds on existing fine-tuning methods.

The paper tackles the tendency of multilingual LLMs to respond in unintended high-resource languages such as English. It introduces CoCo-CoLa, a metric for evaluating language adherence, and finds that partially fine-tuning key layers improves adherence at reduced computational cost. The approach achieves performance comparable or superior to full fine-tuning, especially for low-resource languages.

Multilingual Large Language Models (LLMs) develop cross-lingual abilities despite being trained on limited parallel data. However, they often struggle to generate responses in the intended language, favoring high-resource languages such as English. In this work, we introduce CoCo-CoLa (Correct Concept - Correct Language), a novel metric to evaluate language adherence in multilingual LLMs. Using fine-tuning experiments on a closed-book QA task across seven languages, we analyze how training in one language affects others' performance. Our findings reveal that multilingual models share task knowledge across languages but exhibit biases in the selection of output language. We identify language-specific layers, showing that final layers play a crucial role in determining output language. Accordingly, we propose a partial training strategy that selectively fine-tunes key layers, improving language adherence while significantly reducing computational cost. Our method achieves comparable or superior performance to full fine-tuning, particularly for low-resource languages, offering a more efficient multilingual adaptation.
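
The partial training idea described in the abstract can be illustrated with a minimal sketch. It assumes the "key layers" are simply the last k transformer blocks and that parameters follow a hypothetical `layers.<index>.` naming scheme; the paper's actual layer selection procedure is not given here and may differ. The function name `select_trainable` and the example parameter names are illustrative only.

```python
import re

# Assumption: parameters of transformer block i carry "layers.<i>." in their
# name (a common convention, not something stated in the abstract).
LAYER_PATTERN = re.compile(r"\blayers\.(\d+)\.")


def select_trainable(param_names, num_layers, last_k):
    """Return the parameter names that belong to the final `last_k` blocks.

    Everything else (embeddings, earlier blocks, output head) is treated as
    frozen in this sketch.
    """
    if not 0 <= last_k <= num_layers:
        raise ValueError("last_k must be between 0 and num_layers")
    first_trainable = num_layers - last_k
    trainable = []
    for name in param_names:
        match = LAYER_PATTERN.search(name)
        if match and int(match.group(1)) >= first_trainable:
            trainable.append(name)
    return trainable


# Hypothetical parameter names for a 4-block model.
example_names = [
    "embed.weight",
    "model.layers.0.attn.weight",
    "model.layers.1.attn.weight",
    "model.layers.2.mlp.weight",
    "model.layers.3.mlp.weight",
    "lm_head.weight",
]
selected = select_trainable(example_names, num_layers=4, last_k=2)
```

In a training framework, the returned names would typically be used to mark only those parameters as trainable (for example, by setting `requires_grad` on PyTorch parameters), so gradients and optimizer state are kept for a fraction of the model.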
