CL | AI | Jan 29

Indic-TunedLens: Interpreting Multilingual Models in Indian Languages

arXiv:2602.15038v2 | h-index: 8 | Has Code
AI Analysis

This work addresses the pressing concern of cross-lingual interpretability for multilingual models in linguistically diverse regions like India and provides insights into layer-wise semantic encoding, though it is incremental because it builds on existing interpretability tools.

The authors tackled the problem of interpretability for multilingual large language models in Indian languages, introducing Indic-TunedLens, which significantly improves over state-of-the-art interpretability methods on the MMLU benchmark across 10 languages, especially for morphologically rich, low-resource ones.

Multilingual large language models (LLMs) are increasingly deployed in linguistically diverse regions like India, yet most interpretability tools remain tailored to English. Prior work reveals that LLMs often operate in English-centric representation spaces, making cross-lingual interpretability a pressing concern. We introduce Indic-TunedLens, a novel interpretability framework specifically for Indian languages that learns shared affine transformations. Unlike the standard Logit Lens, which directly decodes intermediate activations, Indic-TunedLens adjusts hidden states for each target language, aligning them with the target output distributions to enable more faithful decoding of model representations. We evaluate our framework on 10 Indian languages using the MMLU benchmark and find that it significantly improves over SOTA interpretability methods, especially for morphologically rich, low-resource languages. Our results provide crucial insights into the layer-wise semantic encoding of multilingual transformers. Our model is available at https://huggingface.co/spaces/MihirRajeshPanchal/IndicTunedLens. Our code is available at https://github.com/MihirRajeshPanchal/IndicTunedLens.
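
The contrast the abstract draws between the Logit Lens and a learned affine adjustment can be illustrated with a toy sketch. This is not the authors' implementation: the dimensions, the random stand-in matrices (`W_U`, `h_mid`, `h_final`, `M`), the KL objective, and the plain gradient-descent loop are all assumptions made for illustration. It only shows the general idea of decoding an intermediate hidden state directly versus first passing it through an affine map fitted to match the final-layer output distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab, n = 8, 20, 64  # toy sizes chosen for illustration, not from the paper

# Stand-ins (assumptions): an unembedding matrix and hidden states for n tokens.
W_U = 0.3 * rng.normal(size=(d, vocab))
h_mid = rng.normal(size=(n, d))                  # intermediate-layer states
M = np.eye(d) + 0.5 * rng.normal(size=(d, d))    # hypothetical drift between layers
h_final = h_mid @ M + 0.1                        # final-layer states


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mean_kl(p, q, eps=1e-12):
    """Mean KL(p || q) over the batch."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))


# Target: the model's final output distribution.
p_final = softmax(h_final @ W_U)

# Logit Lens: decode the intermediate state directly with the unembedding.
kl_logit = mean_kl(p_final, softmax(h_mid @ W_U))

# Tuned-lens-style translator: affine map (A, b), initialised to the identity,
# fitted by gradient descent to minimise KL against the final distribution.
A, b = np.eye(d), np.zeros(d)
lr = 0.05
for _ in range(500):
    q = softmax((h_mid @ A + b) @ W_U)
    g_z = (q - p_final) @ W_U.T / n   # gradient of mean KL w.r.t. translated states
    A -= lr * h_mid.T @ g_z
    b -= lr * g_z.sum(axis=0)

kl_tuned = mean_kl(p_final, softmax((h_mid @ A + b) @ W_U))
print(f"KL logit lens: {kl_logit:.4f}  KL affine translator: {kl_tuned:.4f}")
```

In a per-language setting along the lines the abstract describes, one would presumably fit such a translator on text in each target language using hidden states from the real model; the sketch above uses synthetic data only.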

