Improving Multilingual Language Models by Aligning Representations through Steering
This work addresses the underexplored issue of multilingual representation in LLMs, offering an efficient method for improving their capabilities across languages.
The paper examines how Large Language Models represent non-English tokens and proposes a lightweight intervention based on representation steering. Evaluated against seven competitive baselines, the method outperforms most of them and performs on par with production-grade translation systems while requiring far fewer resources.
This paper investigates how Large Language Models (LLMs) represent non-English tokens -- a question that remains underexplored despite recent progress. We propose a lightweight intervention method using representation steering, where a learned vector is added to the residual stream at a single model layer to enhance multilingual performance. Through extensive experiments across seven competitive baselines -- including prompt optimization, supervised fine-tuning (SFT), in-context learning, cross-lingual transfer, and translation-based methods -- we show that our approach consistently outperforms most alternatives. In particular, it achieves performance on par with production-grade translation systems while requiring far fewer resources. We further explore the complementarity between our method and SFT, demonstrating that steering offers a direct, efficient way to realign internal representations. These findings underscore the potential of activation-level interventions as a powerful tool for improving the multilingual capabilities of LLMs.
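The core operation described above, adding a vector to the residual stream at a single layer, can be illustrated with a minimal sketch. This is a toy stand-in rather than the paper's implementation: the names `forward`, `steer_layer`, and `steer_vec` are hypothetical, the layers are simple placeholder functions instead of transformer blocks, and the steering vector is fixed by hand here, whereas the paper learns it (the training procedure is not shown).

```python
from typing import Callable, List, Optional

Vector = List[float]
Layer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def forward(
    layers: List[Layer],
    h: Vector,
    steer_layer: Optional[int] = None,  # hypothetical: index of the steered layer
    steer_vec: Optional[Vector] = None,  # hypothetical: the steering vector
) -> Vector:
    """Run a toy residual stack, optionally steering at one layer.

    Each layer writes its output back into the residual stream (h + layer(h)).
    If steer_layer is set, steer_vec is added to the stream right after that
    layer's update, and all later layers see the shifted state.
    """
    for i, layer in enumerate(layers):
        h = add(h, layer(h))  # residual update
        if steer_vec is not None and i == steer_layer:
            h = add(h, steer_vec)  # single-layer steering intervention
    return h


# Placeholder layers (assumption): each one scales its input by 0.5.
toy_layers: List[Layer] = [lambda h: [0.5 * x for x in h] for _ in range(2)]

baseline = forward(toy_layers, [1.0, 2.0])
steered = forward(toy_layers, [1.0, 2.0], steer_layer=0, steer_vec=[1.0, 0.0])
```

Because the vector is injected into the residual stream, its effect propagates through every subsequent layer even though only one layer is modified; the base model's weights stay untouched.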