CL | AI | Oct 23, 2025

The Reasoning Lingua Franca: A Double-Edged Sword for Multilingual AI

Microsoft
arXiv:2510.20647v1 | 3 citations | h-index: 31
Originality: Incremental advance
AI Analysis

This work addresses interpretability and cultural-nuance issues that affect multilingual users of AI, but the advance is incremental because it builds on existing model evaluations.

The study investigated multilingual reasoning in Large Reasoning Models, finding that they default to English reasoning for non-English questions, which improves accuracy on complex tasks but introduces translation errors that could be avoided with native-language reasoning.

Large Reasoning Models (LRMs) achieve strong performance on mathematical, scientific, and other question-answering tasks, but their multilingual reasoning abilities remain underexplored. When presented with non-English questions, LRMs often default to reasoning in English, raising concerns about interpretability and the handling of linguistic and cultural nuances. We systematically compare an LRM's reasoning in English versus the language of the question. Our evaluation spans two tasks: MGSM and GPQA Diamond. Beyond measuring answer accuracy, we also analyze cognitive attributes in the reasoning traces. We find that English reasoning traces exhibit a substantially higher presence of these cognitive behaviors, and that reasoning in English generally yields higher final-answer accuracy, with the performance gap increasing as tasks become more complex. However, this English-centric strategy is susceptible to a key failure mode: getting "Lost in Translation," where translation steps introduce errors that would have been avoided by reasoning in the question's language.
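
The comparison described in the abstract (final-answer accuracy when reasoning in English versus in the question's language) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The record layout, the function names `accuracy_by_condition` and `english_gap`, the mode labels `"english"` and `"native"`, and the toy records are all assumptions made for illustration, and the numbers are invented rather than taken from the paper.

```python
from collections import defaultdict


def accuracy_by_condition(records):
    """Return accuracy per (question_lang, reasoning_mode) pair.

    `records` is an iterable of (question_lang, reasoning_mode, is_correct)
    tuples; this layout is a hypothetical one chosen for the sketch.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for question_lang, reasoning_mode, is_correct in records:
        key = (question_lang, reasoning_mode)
        totals[key] += 1
        hits[key] += int(is_correct)
    return {key: hits[key] / totals[key] for key in totals}


def english_gap(acc):
    """Accuracy with English reasoning minus accuracy with native reasoning.

    Only question languages that have both conditions are reported.
    """
    langs = {lang for lang, _ in acc}
    return {
        lang: acc[(lang, "english")] - acc[(lang, "native")]
        for lang in sorted(langs)
        if (lang, "english") in acc and (lang, "native") in acc
    }


# Invented toy records, NOT results from the paper.
toy_records = [
    ("de", "english", True), ("de", "english", True),
    ("de", "native", True), ("de", "native", False),
    ("ja", "english", True), ("ja", "english", False),
    ("ja", "native", True), ("ja", "native", False),
]
toy_acc = accuracy_by_condition(toy_records)
toy_gap = english_gap(toy_acc)
```

A positive value in `toy_gap` means English reasoning scored higher for that question language. Computing the same gap separately per task (for example MGSM and GPQA Diamond) would be one way to check the abstract's claim that the gap widens as tasks become more complex.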
