CLAILGDec 11, 2025

Grow Up and Merge: Scaling Strategies for Efficient Language Adaptation

arXiv:2512.10772v11 citationsh-index: 6
Originality Incremental advance
AI Analysis

This addresses the problem of resource-efficient language adaptation for multilingual NLP, but it is incremental as it builds on existing scaling and merging techniques.

The paper tackles the challenge of adapting pretrained language models to medium- and lower-resource languages by investigating scaling as an efficient strategy, finding that larger upscaled models can match or surpass smaller models with more data, improving data efficiency and reducing catastrophic forgetting in English.

Achieving high-performing language models which include medium- and lower-resource languages remains a challenge. Massively multilingual models still underperform compared to language-specific adaptations, especially at smaller model scales. In this work, we investigate scaling as an efficient strategy for adapting pretrained models to new target languages. Through comprehensive scaling ablations with approximately FLOP-matched models, we test whether upscaling an English base model enables more effective and resource-efficient adaptation than standard continued pretraining. We find that, once exposed to sufficient target-language data, larger upscaled models can match or surpass the performance of smaller models continually pretrained on much more data, demonstrating the benefits of scaling for data efficiency. Scaling also helps preserve the base model's capabilities in English, thus reducing catastrophic forgetting. Finally, we explore whether such scaled, language-specific models can be merged to construct modular and flexible multilingual systems. We find that while merging remains less effective than joint multilingual training, upscaled merges perform better than smaller ones. We observe large performance differences across merging methods, suggesting potential for improvement through merging approaches specialized for language-level integration.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes