CL · Dec 17, 2024

Exploring Cross-lingual Latent Transplantation: Mutual Opportunities and Open Challenges

Yangfan Ye, Xiaocheng Feng, Xiachong Feng, Libo Qin, Yichong Huang, Lei Huang, Weitao Ma, Qichen Hong, Zhirui Zhang, Yunfei Lu, Xiaohui Yan, Duyu Tang

arXiv:2412.12686v23.42 citations · h-index: 28 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving multilingual and cultural performance in LLMs for users of low-resource languages, though it appears incremental as it builds on existing knowledge without a major paradigm shift.

The paper tackles the problem of imbalances in multilingual capabilities and cultural adaptability in large language models due to English-centric pre-training by introducing a cross-lingual latent transplantation framework, which empirically shows mutually beneficial effects on these aspects, particularly for low-resource languages and cultures.

Current large language models (LLMs) often exhibit imbalances in multilingual capabilities and cultural adaptability, largely attributed to their English-centric pre-training data. In this paper, we introduce and investigate a cross-lingual latent transplantation (XTransplant) framework, which aims to further exploit the model's internalized multilingual knowledge during inference and to examine its effects on the multilingual capability and cultural adaptability of LLMs. The XTransplant framework enables models to harness the complementary strengths of both English and non-English resources by transplanting latent activations across languages. Through extensive analysis, we empirically demonstrate that XTransplant, a form of cross-lingual interaction, has mutually beneficial effects on the multilingual capability and cultural adaptability of LLMs, particularly for low-resource languages and cultures. We further reveal that attention modules play a pivotal role in supporting multilingual understanding, while feed-forward modules are more adept at capturing culture-specific knowledge. In addition, we conduct an in-depth analysis of XTransplant's stability, effectiveness, and generalizability. By probing the upper-bound performance of XTransplant, we expose the considerable underutilization of current LLMs' multilingual potential, a challenge that remains open. We hope our analysis offers a new lens for advancing cross-lingual interactions and better leveraging models' internalized multilingual knowledge.
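The abstract does not spell out how latent activations are transplanted, so the following is only a minimal sketch of the general idea (generic activation patching), not the authors' implementation. It assumes a toy "model" made of simple affine layers. The names `make_layer`, `forward`, and `transplant`, the layer indices, and the input vectors are all hypothetical and chosen for illustration. An activation is captured from a forward pass on one input (standing in for the English prompt) and written into a second forward pass on another input (standing in for the non-English prompt).

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_layer(scale: float, shift: float) -> Layer:
    """Hypothetical stand-in for one model block: an elementwise affine map."""
    def layer(h: Vector) -> Vector:
        return [scale * x + shift for x in h]
    return layer


def forward(
    layers: List[Layer],
    h: Vector,
    capture_at: Optional[int] = None,
    patch_at: Optional[int] = None,
    patch_value: Optional[Vector] = None,
) -> Tuple[Vector, Optional[Vector]]:
    """Run h through the layers in order.

    capture_at: index of the layer whose output is recorded and returned.
    patch_at:   index of the layer whose output is overwritten by patch_value.
    """
    captured = None
    for idx, layer in enumerate(layers):
        h = layer(h)
        if patch_at == idx and patch_value is not None:
            h = list(patch_value)  # transplant: replace this layer's output
        if capture_at == idx:
            captured = list(h)
    return h, captured


def transplant(
    layers: List[Layer],
    source_input: Vector,
    target_input: Vector,
    source_layer: int,
    target_layer: int,
) -> Vector:
    """Capture a latent from the source pass, inject it into the target pass."""
    _, latent = forward(layers, source_input, capture_at=source_layer)
    output, _ = forward(
        layers, target_input, patch_at=target_layer, patch_value=latent
    )
    return output


# Toy setup (all values are illustrative assumptions, not from the paper).
layers = [make_layer(2.0, 0.0), make_layer(1.0, 1.0), make_layer(3.0, 0.0)]
english_prompt = [1.0, 2.0]      # stand-in for an English-prompt representation
target_prompt = [10.0, 20.0]     # stand-in for a non-English-prompt representation

baseline, _ = forward(layers, target_prompt)
patched = transplant(layers, english_prompt, target_prompt,
                     source_layer=0, target_layer=1)
print(baseline, patched)
```

In a real transformer the same pattern would typically be realized with forward hooks on specific sub-modules and would usually overwrite only selected token positions; which layers and modules to use is exactly the kind of choice the paper's analysis is concerned with, and nothing in this sketch should be read as its configuration.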
