Cross-Linguistic Transfer in Multilingual NLP: The Role of Language Families and Morphology
This paper addresses the problem of improving transfer to low-resource languages for NLP practitioners, but its contribution is incremental: it reviews and analyzes existing findings without introducing new methods.
This paper investigates how language family proximity and morphological similarity affect cross-lingual transfer performance in multilingual NLP, comparing model results and correlating linguistic distance metrics with outcomes.
Cross-lingual transfer has become central to multilingual NLP because it allows models trained on resource-rich languages to be applied to low-resource languages. Massively multilingual pre-trained language models (e.g., mBERT, XLM-R) have recently demonstrated strong zero-shot transfer capabilities [14] [13]. This paper examines cross-linguistic transfer through the lens of language families and morphology, investigating how language family proximity and morphological similarity affect performance across NLP tasks. We compare multilingual model performance and review how linguistic distance metrics correlate with transfer outcomes, and we discuss how our results relate to findings from the recent literature. We also examine emerging approaches that integrate typological and morphological information into model pre-training to improve transfer to diverse languages [18] [19].