CL | Jul 30, 2025

Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation

Daniil Gurgurov, Katharina Trinley, Yusser Al Ghussin, Tanja Baeumel, Josef van Genabith, Simon Ostermann

arXiv:2507.22608v3 | 10 citations | h-index: 41 | Has Code | IJCNLP-AACL

Originality: Incremental advance

AI Analysis

This work helps researchers and practitioners interpret and manipulate the multilingual abilities of LLMs. It is an incremental advance, since it builds on existing neuron analysis techniques.

The paper addresses the problem of understanding and controlling language-specific processing in large language models. It identifies language-specific neurons across multiple models and languages, and demonstrates that systematic activation manipulation (language arithmetics) can steer model behavior in multilingual tasks, outperforming simpler replacement approaches.

Large language models (LLMs) exhibit strong multilingual abilities, yet the neural mechanisms behind language-specific processing remain unclear. We analyze language-specific neurons in Llama-3.1-8B, Mistral-Nemo-12B, and Aya-Expanse-8B & 32B across 21 typologically diverse languages, identifying neurons that control language behavior. Using the Language Activation Probability Entropy (LAPE) method, we show that these neurons cluster in deeper layers, with non-Latin scripts showing greater specialization. Related languages share overlapping neurons, reflecting internal representations of linguistic proximity. Through language arithmetics, i.e. systematic activation addition and multiplication, we steer models to deactivate unwanted languages and activate desired ones, outperforming simpler replacement approaches. These interventions effectively guide behavior across five multilingual tasks: language forcing, translation, QA, comprehension, and NLI. Manipulation is more successful for high-resource languages, while typological similarity improves effectiveness. We also demonstrate that cross-lingual neuron steering enhances downstream performance and reveal internal "fallback" mechanisms for language selection when neurons are progressively deactivated. Our code is made publicly available at https://github.com/d-gurgurov/Language-Neurons-Manipulation.
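To make the two ideas in the abstract concrete, here is a minimal pure-Python sketch. It is not the authors' code. It assumes that LAPE scores a neuron by the entropy of its per-language activation probabilities after normalizing them across languages, with "active" meaning an activation above zero, and that language arithmetics can be approximated by multiplying unwanted-language neurons by a scale and adding a constant to desired-language neurons. The function names, the `scale` and `boost` parameters, and the toy data are all hypothetical. The selection thresholds a real pipeline would apply are not modeled.

```python
import math


def activation_probabilities(activations_by_lang):
    """Fraction of tokens on which each neuron fires, per language.

    activations_by_lang maps a language code to a list of token rows,
    each row holding one activation value per neuron.
    Assumption: a neuron counts as active when its activation is > 0.
    """
    probs = {}
    for lang, rows in activations_by_lang.items():
        n_neurons = len(rows[0])
        counts = [0] * n_neurons
        for row in rows:
            for j, value in enumerate(row):
                if value > 0:
                    counts[j] += 1
        probs[lang] = [c / len(rows) for c in counts]
    return probs


def lape_scores(probs):
    """Entropy of each neuron's activation probabilities across languages.

    Low entropy means the neuron fires mostly for one (or few) languages.
    Neurons that never fire get infinite entropy so they are never selected.
    """
    langs = list(probs)
    n_neurons = len(probs[langs[0]])
    scores = []
    for j in range(n_neurons):
        p = [probs[lang][j] for lang in langs]
        total = sum(p)
        if total == 0:
            scores.append(float("inf"))
            continue
        entropy = -sum((x / total) * math.log(x / total) for x in p if x > 0)
        scores.append(entropy)
    return scores


def steer(hidden, off_neurons, on_neurons, scale=0.0, boost=1.0):
    """Hypothetical arithmetic intervention on one activation vector.

    Multiplies the unwanted language's neurons by `scale` (0.0 silences
    them) and adds `boost` to the desired language's neurons.
    """
    out = list(hidden)
    for j in off_neurons:
        out[j] *= scale
    for j in on_neurons:
        out[j] += boost
    return out


# Toy data: neuron 0 fires only on German tokens, neuron 1 only on
# Chinese tokens, neuron 2 on both.
toy = {
    "de": [[1.0, -1.0, 0.5], [2.0, -0.5, 0.3]],
    "zh": [[-1.0, 1.5, 0.4], [-0.2, 0.7, 0.9]],
}
scores = lape_scores(activation_probabilities(toy))
steered = steer([0.8, 0.0, 0.5], off_neurons=[0], on_neurons=[1])
```

In this toy setup neurons 0 and 1 receive the lowest entropy and would be flagged as language-specific, while neuron 2 is shared. In a real model the same operations would be applied to feed-forward activations inside a forward hook, and the choice of `scale` and `boost` would need tuning per language.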
