CL
Sep 4, 2021

On the ability of monolingual models to learn language-agnostic representations

arXiv:2109.01942v2
8 citations
Originality: Incremental advance
AI Analysis

The paper's findings challenge the assumption that multilingual pretraining is necessary for cross-lingual transfer, potentially simplifying model development for language tasks.

The paper asks whether monolingual models can learn language-agnostic representations. It finds that models pretrained on one language and finetuned on another achieve performance competitive with models pretrained on the target language, and that results are similar regardless of the pretraining language: models pretrained on distant languages such as German and Portuguese perform similarly on English tasks.

Pretrained multilingual models have become a de facto default approach for zero-shot cross-lingual transfer. Previous work has shown that these models are able to achieve cross-lingual representations when pretrained on two or more languages with shared parameters. In this work, we provide evidence that a model can achieve language-agnostic representations even when pretrained on a single language. That is, we find that monolingual models pretrained and finetuned on different languages achieve competitive performance compared to the ones that use the same target language. Surprisingly, the models show similar performance on the same task regardless of the pretraining language. For example, models pretrained on distant languages such as German and Portuguese perform similarly on English tasks.

