CL CY · Jun 15, 2024

Multilingual Large Language Models and Curse of Multilinguality

Daniil Gurgurov, Tanja Bäumel, Tatiana Anikina

arXiv:2406.10602v2 · 9.117 citations

Originality: Synthesis-oriented

AI Analysis

The paper offers a foundational resource for NLP researchers and practitioners by surveying existing models and challenges. Its contribution is incremental, since it primarily reviews and synthesizes current knowledge without introducing new methods or results.

This paper provides an introductory overview of multilingual large language models, explaining their technical aspects and addressing the curse of multilinguality as a significant limitation.

Multilingual Large Language Models (LLMs) have gained considerable popularity among Natural Language Processing (NLP) researchers and practitioners. These models, trained on huge datasets, show proficiency across many languages and are effective in numerous downstream tasks. This paper navigates the landscape of multilingual LLMs, providing an introductory overview of their technical aspects. It explains the underlying architectures, objective functions, pre-training data sources, and tokenization methods. It also explores the distinguishing features of the different model types: encoder-only (mBERT, XLM-R), decoder-only (XGLM, PaLM, BLOOM, GPT-3), and encoder-decoder (mT5, mBART). Additionally, it addresses one of the significant limitations of multilingual LLMs, the curse of multilinguality, and discusses current attempts to overcome it.
