All Languages Matter: Understanding and Mitigating Language Bias in Multilingual RAG
This paper addresses a critical problem for multilingual AI applications by reducing bias in cross-lingual evidence retrieval, though the contribution is incremental because it builds on existing mRAG frameworks.
The paper tackles language bias in multilingual Retrieval-Augmented Generation (mRAG) systems, where rerankers favor English and the query's native language, which limits downstream performance. It proposes LAURA, a language-agnostic utility-driven reranker alignment method that mitigates this bias and consistently improves mRAG performance across diverse languages and models.
Multilingual Retrieval-Augmented Generation (mRAG) leverages cross-lingual evidence to ground Large Language Models (LLMs) in global knowledge. However, we show that current mRAG systems suffer from a language bias during reranking, systematically favoring English and the query's native language. By introducing an estimated oracle evidence analysis, we quantify a substantial performance gap between existing rerankers and the achievable upper bound. Further analysis reveals a critical distributional mismatch: while optimal predictions require evidence scattered across multiple languages, current systems systematically suppress such "answer-critical" documents, thereby limiting downstream generation performance. To bridge this gap, we propose Language-Agnostic Utility-driven Reranker Alignment (LAURA), which aligns multilingual evidence ranking with downstream generative utility. Experiments across diverse languages and generation models show that LAURA effectively mitigates language bias and consistently improves mRAG performance.
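As a rough illustration of the reranking bias described above, the sketch below compares each language's share of the reranked top-k with its share of the full candidate pool. This is not the paper's analysis or the LAURA method. The document fields (`lang`, `score`), the function names, and the toy scores are all assumptions made for the example.

```python
from collections import Counter


def language_share(docs):
    """Return the fraction of documents per language code."""
    counts = Counter(doc["lang"] for doc in docs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: n / total for lang, n in counts.items()}


def top_k_language_skew(candidates, k):
    """Compare language share in the reranked top-k with the candidate pool.

    A positive value means the language is over-represented after
    reranking; a negative value means it is suppressed.
    """
    # Rank candidates by the (hypothetical) reranker score, highest first.
    ranked = sorted(candidates, key=lambda d: d["score"], reverse=True)
    pool = language_share(candidates)
    top = language_share(ranked[:k])
    return {lang: top.get(lang, 0.0) - share for lang, share in pool.items()}


# Toy candidate pool with made-up scores (not data from the paper).
candidates = [
    {"id": "d1", "lang": "en", "score": 0.9},
    {"id": "d2", "lang": "en", "score": 0.8},
    {"id": "d3", "lang": "de", "score": 0.4},
    {"id": "d4", "lang": "zh", "score": 0.3},
]

skew = top_k_language_skew(candidates, k=2)
for lang, delta in sorted(skew.items()):
    print(f"{lang}: {delta:+.2f}")
```

In this toy pool, English holds half of the candidates but all of the top-2 slots, so its skew is positive while German and Chinese come out negative. A skew measure like this only describes which languages survive reranking. It says nothing about whether the suppressed documents were answer-critical, which is the question the oracle evidence analysis in the abstract targets.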