CL, AI
Jun 16, 2025

Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models

arXiv:2506.13044v1
5 citations
h-index: 4
ACL
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of enhancing the multilingual performance of large language models on translation and reasoning tasks, though it appears incremental, as it builds on an existing debate about the utility of parallel data.

The paper systematically studies how adding parallel data affects large language models' multilingual capabilities, specifically translation and multilingual common-sense reasoning, and demonstrates through controlled experiments that parallel data can significantly improve these capabilities.

Large language models (LLMs) have demonstrated impressive translation capabilities even without being explicitly trained on parallel data. This remarkable property has led some to believe that parallel data is no longer necessary for building multilingual language models. While some attribute this to the emergent abilities of LLMs due to scale, recent work suggests that it is actually caused by incidental bilingual signals present in the training data. Various methods have been proposed to maximize the utility of parallel data to enhance the multilingual capabilities of multilingual encoder-based and encoder-decoder language models. However, some decoder-based LLMs opt to ignore parallel data instead. In this work, we conduct a systematic study on the impact of adding parallel data on LLMs' multilingual capabilities, focusing specifically on translation and multilingual common-sense reasoning. Through controlled experiments, we demonstrate that parallel data can significantly improve LLMs' multilingual capabilities.

