CL AI Jun 11, 2023

Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability

arXiv:2306.06688v18.536 citations h-index: 39 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of multilingual reasoning transfer for AI researchers, providing empirical insights that are incremental but valuable for model design.

The study investigated whether English-centric models like LLaMA possess multilingual transfer ability for reasoning tasks. It found that they can sometimes outperform multilingual pre-trained models like BLOOM, that English appears to be a less suitable source language, and that the choice of source language matters less as the English-centric model scales up.

Multilingual transfer ability, which reflects how well models fine-tuned on one source language can be applied to other languages, has been well studied in multilingual pre-trained models (e.g., BLOOM). However, this ability has not been investigated for English-centric models (e.g., LLaMA). To fill this gap, we study the following research questions. First, does multilingual transfer ability exist in English-centric models, and how does it compare with that of multilingual pretrained models? Second, does it appear only when English is the source language for the English-centric model? Third, how does it vary across different tasks? We take multilingual reasoning ability as our focus and conduct extensive experiments across four types of reasoning tasks. We find that the multilingual pretrained model does not always outperform an English-centric model. Furthermore, English appears to be a less suitable source language, and the choice of source language becomes less important when the English-centric model scales up. In addition, different types of tasks exhibit different multilingual transfer abilities. These findings demonstrate that English-centric models not only possess multilingual transfer ability but may even surpass the transferability of multilingual pretrained models if well-trained. By showing the strengths and weaknesses, the experiments also provide valuable insights into enhancing multilingual reasoning abilities for English-centric models.
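To make the notion of transfer from a source language concrete, here is a minimal sketch of one possible way to summarize it: average the accuracy on every target language other than the source. The function name `transfer_scores`, the nested-dictionary layout, and the averaging rule are assumptions made for illustration and are not taken from the paper. The numbers are toy placeholders, not reported results.

```python
from statistics import mean


def transfer_scores(acc):
    """Average cross-lingual accuracy per source language.

    acc[source][target] is the accuracy of a model fine-tuned on
    `source` and evaluated on `target` (hypothetical layout).
    The source language itself is excluded from the average.
    """
    scores = {}
    for source, row in acc.items():
        others = [a for target, a in row.items() if target != source]
        scores[source] = mean(others)
    return scores


# Toy placeholder numbers for illustration only, not results from the paper.
toy_acc = {
    "en": {"en": 0.80, "de": 0.60, "zh": 0.40},
    "de": {"en": 0.70, "de": 0.75, "zh": 0.50},
}

scores = transfer_scores(toy_acc)
best_source = max(scores, key=scores.get)
```

With a summary of this kind, comparing source languages or model families reduces to comparing the per-source averages, although a real study would likely report per-task and per-target breakdowns as well.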

