Typologically Informed Parameter Aggregation
TIPA provides a training-free way to improve cross-lingual generalization in NLP, particularly for low-resource languages. The contribution is incremental, since it builds on existing adapter-based methods.
The paper addresses the underperformance of massively multilingual models on low-resource and unseen languages. It introduces Typologically Informed Parameter Aggregation (TIPA), which constructs proxy language adapters without any training. TIPA consistently outperforms or matches baselines across five NLP tasks and over 230 languages, with the largest gains for languages lacking dedicated adapters.
Massively multilingual language models enable cross-lingual generalization but underperform on low-resource and unseen languages. While adapter-based fine-tuning offers a parameter-efficient solution, training language-specific adapters at scale remains costly. We introduce Typologically Informed Parameter Aggregation (TIPA), a training-free method that constructs proxy language adapters by aggregating existing ones, weighted by typological similarity. Integrated into the MAD-X framework, these proxies enable zero-shot cross-lingual transfer without additional training. We evaluate TIPA on five NLP tasks and over 230 languages. TIPA consistently outperforms or matches baselines such as English-only fine-tuning or selecting the typologically closest language adapter. We see the largest gains for languages lacking dedicated adapters. Our results demonstrate that typologically informed aggregation provides a viable alternative to language-specific modules without any training needed.
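The aggregation step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract says only that existing adapters are aggregated with weights based on typological similarity, so the softmax normalization, the `temperature` parameter, the function name `aggregate_adapters`, and the dictionary layout of the adapters are all assumptions made for illustration. Real adapters would be loaded from a MAD-X style checkpoint, and the similarity scores would come from a typological resource.

```python
import numpy as np


def aggregate_adapters(adapters, similarities, temperature=1.0):
    """Build a proxy adapter as a similarity-weighted average of existing ones.

    adapters:     dict mapping language code -> dict of parameter name -> np.ndarray
                  (hypothetical layout; all adapters must share names and shapes)
    similarities: dict mapping language code -> typological similarity to the target
    temperature:  assumed softmax temperature; lower values favor the closest languages
    """
    langs = sorted(adapters)
    if not langs:
        raise ValueError("at least one source adapter is required")

    # Assumption: turn similarities into weights with a softmax.
    # The abstract does not specify the normalization that TIPA uses.
    logits = np.array([similarities[lang] for lang in langs], dtype=float) / temperature
    logits -= logits.max()  # subtract the maximum for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum()

    # Weighted sum of each parameter tensor across the source languages.
    proxy = {
        name: sum(w * adapters[lang][name] for w, lang in zip(weights, langs))
        for name in adapters[langs[0]]
    }
    return proxy, dict(zip(langs, weights))


# Toy usage: two source adapters with a single 2x2 parameter each.
toy_adapters = {
    "de": {"down_proj": np.ones((2, 2))},
    "nl": {"down_proj": 3.0 * np.ones((2, 2))},
}
toy_similarities = {"de": 0.5, "nl": 0.5}
proxy_adapter, lang_weights = aggregate_adapters(toy_adapters, toy_similarities)
```

With equal similarity scores the weights are uniform, so the proxy is the plain average of the source adapters. Skewing the scores toward one language pulls the proxy toward that language's parameters, and in the limit of a very low temperature the result approaches the closest-language baseline mentioned in the abstract.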