CLAIAug 23, 2025

Linguistic Neuron Overlap Patterns to Facilitate Cross-lingual Transfer on Low-resource Languages

arXiv:2508.17078v23 citationsh-index: 2Has CodeEMNLP
Originality Incremental advance
AI Analysis

This addresses the problem of data-efficient cross-lingual transfer for low-resource languages, offering an incremental improvement over existing methods.

The paper tackles the challenge of improving large language model performance on low-resource languages by proposing BridgeX-ICL, a method that uses shared neurons to enhance zero-shot cross-lingual in-context learning, validated on 4 tasks and 15 language pairs.

The current Large Language Models (LLMs) face significant challenges in improving their performance on low-resource languages and urgently need data-efficient methods without costly fine-tuning. From the perspective of language-bridge, we propose a simple yet effective method, namely BridgeX-ICL, to improve the zero-shot Cross-lingual In-Context Learning (X-ICL) for low-resource languages. Unlike existing works focusing on language-specific neurons, BridgeX-ICL explores whether sharing neurons can improve cross-lingual performance in LLMs. We construct neuron probe data from the ground-truth MUSE bilingual dictionaries, and define a subset of language overlap neurons accordingly to ensure full activation of these anchored neurons. Subsequently, we propose an HSIC-based metric to quantify LLMs' internal linguistic spectrum based on overlapping neurons, guiding optimal bridge selection. The experiments conducted on 4 cross-lingual tasks and 15 language pairs from 7 diverse families, covering both high-low and moderate-low pairs, validate the effectiveness of BridgeX-ICL and offer empirical insights into the underlying multilingual mechanisms of LLMs. The code is publicly available at https://github.com/xuyuemei/BridgeX-ICL.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes