CL AI LG Jul 21, 2025

The Impact of Language Mixing on Bilingual LLM Reasoning

Yihao Li, Jiayi Xin, Miranda Muqing Miao, Qi Long, Lyle Ungar

arXiv:2507.15849v215.510 citations h-index: 20 EMNLP

Originality: Incremental advance

AI Analysis

This work addresses how to optimize reasoning in bilingual LLMs, a question of interest to AI researchers, and argues that language mixing offers strategic benefits rather than merely incremental improvements.

The study investigates language mixing in Chinese-English bilingual reasoning models and finds that it enhances reasoning performance: enforcing monolingual decoding reduces accuracy by 5.6 percentage points on MATH500, while using a lightweight probe to guide decoding improves accuracy by 2.92 percentage points.

Proficient multilingual speakers often intentionally switch languages in the middle of a conversation. Similarly, recent reasoning-focused bilingual large language models (LLMs) with strong capabilities in both languages exhibit language mixing, that is, alternating languages within their chain of thought. Discouraging this behavior in DeepSeek-R1 was found to degrade accuracy, suggesting that language mixing may benefit reasoning. In this work, we study language switching in Chinese-English bilingual reasoning models. We identify reinforcement learning with verifiable rewards (RLVR) as the critical training stage that leads to language mixing. We show that language mixing can enhance reasoning: enforcing monolingual decoding reduces accuracy by 5.6 percentage points on MATH500. Additionally, a lightweight probe can be trained to predict whether a potential language switch would benefit or harm reasoning, and when used to guide decoding, increases accuracy by 2.92 percentage points. Our findings suggest that language mixing is not merely a byproduct of multilingual training, but is a strategic reasoning behavior.
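To make "alternating languages within their chain of thought" concrete, the following is a minimal sketch of how one might count Chinese-English switches in a reasoning trace. It is not the authors' method: the character-range heuristic (CJK Unified Ideographs as Chinese, ASCII letters as English) and the function names `script_of` and `count_switches` are assumptions introduced here purely for illustration.

```python
def script_of(ch):
    """Classify a character as 'zh', 'en', or None (hypothetical heuristic)."""
    # Assumption: the basic CJK Unified Ideographs block stands in for Chinese.
    if "\u4e00" <= ch <= "\u9fff":
        return "zh"
    # Assumption: ASCII letters stand in for English.
    if ch.isascii() and ch.isalpha():
        return "en"
    # Digits, punctuation, whitespace, and math symbols are language-neutral.
    return None


def count_switches(text):
    """Count how often the script changes between 'zh' and 'en' in text."""
    switches = 0
    prev = None
    for ch in text:
        current = script_of(ch)
        if current is None:
            continue  # skip neutral characters
        if prev is not None and current != prev:
            switches += 1
        prev = current
    return switches


if __name__ == "__main__":
    trace = "Let x = 2, 所以 x+1 = 3, done"
    print(count_switches(trace))
```

A character-level heuristic like this is crude (it ignores tokenization and treats variable names as English), so it is best read as a rough diagnostic rather than a measurement procedure.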
