Unveiling the Influence of Amplifying Language-Specific Neurons
This work addresses the role of language-specific neurons in multilingual behavior and offers insights for low-resource languages, but it is incremental in scope.
The study investigated the effect of amplifying language-specific neurons in multilingual LLMs across 18 languages, finding that optimal amplification effectively steers output toward target languages but generally degrades cross-language performance on tasks like reasoning and translation.
Language-specific neurons in LLMs, which strongly correlate with individual languages, have been shown to influence model behavior when deactivated. However, the effect of amplifying them remains underexplored. This work investigates the effect of amplifying language-specific neurons through interventions across 18 languages, including low-resource ones, using three models whose primary training languages differ. We compare amplification factors by how effectively they steer output to the target language, measured with a proposed Language Steering Shift (LSS) evaluation score, and then evaluate the optimal factors on downstream tasks: commonsense reasoning (XCOPA, XWinograd), knowledge (Include), and translation (FLORES). The optimal amplification factors effectively steer output toward nearly all tested languages. Intervention using these factors on downstream tasks improves self-language performance in some cases but generally degrades cross-language results. These findings highlight the effect of language-specific neurons on multilingual behavior: amplification can be beneficial, especially for low-resource languages, but provides limited advantage for cross-lingual transfer.
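
As a rough illustration of what an amplification intervention could look like, the sketch below scales the activations of a chosen set of neurons by a constant factor. It is a minimal example under stated assumptions, not the procedure used in this work: the function name `amplify_neurons`, the multiplicative form of the intervention, the toy activation values, and the neuron indices are all hypothetical, and the abstract does not specify how language-specific neurons are identified or how the LSS score is computed.

```python
import numpy as np

def amplify_neurons(activations, neuron_idx, factor):
    """Return a copy of `activations` with the selected neurons scaled by `factor`.

    Assumption: amplification is modeled as a simple multiplicative
    intervention on the last (neuron) axis. A factor of 0.0 zeroes the
    neurons, which corresponds to deactivation; a factor above 1.0 amplifies.
    """
    out = np.array(activations, dtype=float, copy=True)
    out[..., neuron_idx] *= factor
    return out

# Toy hidden activations: 2 tokens x 4 neurons (illustrative values only).
hidden = np.array([[1.0, 2.0, 3.0, 4.0],
                   [0.5, 0.0, -1.0, 2.0]])

# Hypothetical indices of neurons associated with one target language.
language_neurons = [1, 3]

amplified = amplify_neurons(hidden, language_neurons, factor=2.0)
print(amplified)
```

In a real model, a function like this would typically be applied to a layer's intermediate activations during the forward pass, and the factor would be swept over a range of values to compare their steering effect.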