CL
Oct 11, 2023

Crosslingual Structural Priming and the Pre-Training Dynamics of Bilingual Language Models

MIT
arXiv:2310.07929v1
1 citation
h-index: 11
Originality: Incremental advance
AI Analysis

This paper studies how grammatical representations develop in bilingual language models, a question of interest to researchers in computational linguistics. The advance is incremental: it extends prior structural priming work to a new language setting.

The study investigated whether multilingual language models develop shared abstract grammatical representations across languages and found that crosslingual structural priming effects emerge early, with less than 1 million tokens of data in the second language.

Do multilingual language models share abstract grammatical representations across languages, and if so, when do these develop? Following Sinclair et al. (2022), we use structural priming to test for abstract grammatical representations with causal effects on model outputs. We extend the approach to a Dutch-English bilingual setting, and we evaluate a Dutch-English language model during pre-training. We find that crosslingual structural priming effects emerge early after exposure to the second language, with less than 1M tokens of data in that language. We discuss implications for data contamination, low-resource transfer, and how abstract grammatical representations emerge in multilingual models.
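To make the structural priming measure concrete, here is a minimal sketch of one common formulation: the priming effect is the log-probability of a target sentence after a prime with the same structure, minus its log-probability after a prime with the alternative structure. This is an illustration under stated assumptions, not necessarily the exact metric used in the paper. The `logprob_fn` interface, the toy scorer, and the example sentences are all hypothetical; in practice `logprob_fn` would wrap a real language model.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical interface: returns log P(token | history) for a single token.
LogProbFn = Callable[[Sequence[str], str], float]


def sentence_logprob(logprob_fn: LogProbFn,
                     context: Sequence[str],
                     target: Sequence[str]) -> float:
    """Sum token log-probabilities of `target`, conditioned on `context`."""
    history: List[str] = list(context)
    total = 0.0
    for token in target:
        total += logprob_fn(history, token)
        history.append(token)
    return total


def priming_effect(logprob_fn: LogProbFn,
                   congruent_prime: Sequence[str],
                   incongruent_prime: Sequence[str],
                   target: Sequence[str]) -> float:
    """log P(target | same-structure prime) - log P(target | other-structure prime)."""
    return (sentence_logprob(logprob_fn, congruent_prime, target)
            - sentence_logprob(logprob_fn, incongruent_prime, target))


def toy_logprob(history: Sequence[str], token: str) -> float:
    """Toy stand-in for a model (an assumption for illustration only):
    English "to" becomes twice as likely once Dutch "aan" has been seen."""
    if token == "to" and "aan" in history:
        return math.log(0.2)
    return math.log(0.1)


# Dutch primes: prepositional-object (PO) vs. double-object (DO) dative.
po_prime = "de kok gaf een boek aan de man".split()
do_prime = "de kok gaf de man een boek".split()
# English PO target, structurally congruent with the Dutch PO prime.
po_target = "the chef gave a book to the man".split()

effect = priming_effect(toy_logprob, po_prime, do_prime, po_target)
print(f"priming effect: {effect:.4f}")
```

A positive value means the target is more probable after the structurally matching prime, which is the pattern interpreted as crosslingual priming. Tracking this quantity across pre-training checkpoints would show when the effect first appears.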

