CL · Apr 30, 2020

MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer

arXiv:2005.00052v3 · 1135 citations
AI Analysis

This work aims to enable NLP applications in low-resource languages through more efficient and portable cross-lingual transfer. It is an incremental improvement that introduces novel architectural components.

Pre-trained multilingual models transfer poorly to low-resource languages. The paper addresses this by proposing MAD-X, an adapter-based framework that learns modular language and task representations. It achieves state-of-the-art results on named entity recognition and causal commonsense reasoning across diverse languages.

The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer. However, due to limited model capacity, their transfer performance is the weakest exactly on such low-resource languages and languages unseen during pre-training. We propose MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages by learning modular language and task representations. In addition, we introduce a novel invertible adapter architecture and a strong baseline method for adapting a pre-trained multilingual model to a new language. MAD-X outperforms the state of the art in cross-lingual transfer across a representative set of typologically diverse languages on named entity recognition and causal commonsense reasoning, and achieves competitive results on question answering. Our code and adapters are available at AdapterHub.ml.
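To make the idea of modular language and task representations more concrete, here is a minimal sketch in plain Python. It assumes a generic residual bottleneck adapter (down-projection, ReLU, up-projection, skip connection), which is a common adapter design and not necessarily the exact architecture used in the paper; the invertible adapter is not shown. The class `BottleneckAdapter`, the helper `run_stack`, the sizes, the random weights, and the language codes are all hypothetical and chosen only for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class BottleneckAdapter:
    """Generic residual bottleneck adapter: h + W_up * relu(W_down * h).

    Assumption: this is a common adapter shape, not the paper's exact module.
    """

    def __init__(self, hidden_size, bottleneck_size, seed):
        rng = random.Random(seed)
        # Down-projection: hidden_size -> bottleneck_size
        self.down = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                     for _ in range(bottleneck_size)]
        # Up-projection: bottleneck_size -> hidden_size
        self.up = [[rng.uniform(-0.1, 0.1) for _ in range(bottleneck_size)]
                   for _ in range(hidden_size)]

    def __call__(self, hidden):
        squeezed = [max(0.0, v) for v in matvec(self.down, hidden)]  # ReLU
        expanded = matvec(self.up, squeezed)
        # The residual connection keeps the frozen model's representation.
        return [h + e for h, e in zip(hidden, expanded)]

    def num_parameters(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


def run_stack(hidden, language_adapter, task_adapter):
    """Apply a language adapter first, then a task adapter on top of it."""
    return task_adapter(language_adapter(hidden))


HIDDEN, BOTTLENECK = 8, 2

# One adapter per language (hypothetical codes), one shared task adapter.
language_adapters = {
    "en": BottleneckAdapter(HIDDEN, BOTTLENECK, seed=1),
    "qu": BottleneckAdapter(HIDDEN, BOTTLENECK, seed=2),
}
ner_task_adapter = BottleneckAdapter(HIDDEN, BOTTLENECK, seed=3)

hidden_state = [0.5] * HIDDEN  # stand-in for one frozen-model hidden vector

# Same task adapter, different language adapter: only one module is swapped.
source_output = run_stack(hidden_state, language_adapters["en"], ner_task_adapter)
target_output = run_stack(hidden_state, language_adapters["qu"], ner_task_adapter)

adapter_params = ner_task_adapter.num_parameters()
dense_layer_params = HIDDEN * HIDDEN
```

In this sketch the language and task modules are independent objects, so a task adapter could in principle be reused with any language adapter without touching the underlying model. The parameter count of one adapter (2 × hidden size × bottleneck size) is smaller than a single dense hidden-to-hidden layer whenever the bottleneck is less than half the hidden size, which illustrates the kind of parameter efficiency the abstract refers to.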

