CL · Nov 15, 2023

Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models

James A. Michaelov, Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen

MIT

arXiv:2311.09194v121.8134 citations · h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses the question of how abstractly language models represent grammar, which matters for understanding their capacity for linguistic generalization. The advance is incremental in that it applies established methods from human cognitive research to models.

The study assesses the abstractness of grammatical knowledge in large language models by measuring crosslingual structural priming. It finds evidence that these models have abstract grammatical representations similar to those of humans: model behavior aligns with human experimental data from eight crosslingual and four monolingual experiments.

Abstract grammatical knowledge - of parts of speech and grammatical patterns - is key to the capacity for linguistic generalization in humans. But how abstract is grammatical knowledge in large language models? In the human literature, compelling evidence for grammatical abstraction comes from structural priming. A sentence that shares the same grammatical structure as a preceding sentence is processed and produced more readily. Because confounds exist when using stimuli in a single language, evidence of abstraction is even more compelling from crosslingual structural priming, where use of a syntactic structure in one language primes an analogous structure in another language. We measure crosslingual structural priming in large language models, comparing model behavior to human experimental results from eight crosslingual experiments covering six languages, and four monolingual structural priming experiments in three non-English languages. We find evidence for abstract monolingual and crosslingual grammatical representations in the models that function similarly to those found in humans. These results demonstrate that grammatical representations in multilingual language models are not only similar across languages, but they can causally influence text produced in different languages.
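
To make the idea of measuring structural priming in a language model concrete, here is a minimal sketch of one way a priming effect could be computed from model log-probabilities. It is an illustration under stated assumptions, not necessarily the procedure used in the paper: the function names, the "DO"/"PO" structure labels (double-object and prepositional-object datives), and the log-probability values are all hypothetical, and in practice the scores would come from a multilingual language model scoring a target sentence after a prime sentence, possibly in a different language.

```python
import math

def normalized_prob(logp_target, logp_alternative):
    """Return P(target) / (P(target) + P(alternative)) from two log-probabilities."""
    # Subtract the larger value before exponentiating for numerical stability.
    m = max(logp_target, logp_alternative)
    num = math.exp(logp_target - m)
    return num / (num + math.exp(logp_alternative - m))

def priming_effect(logp, target, alternative):
    """Difference in the target's normalized probability after a same-structure
    prime versus after an alternative-structure prime.

    logp[(prime, tgt)] is assumed to hold log P(tgt sentence | prime sentence).
    """
    after_same = normalized_prob(logp[(target, target)],
                                 logp[(target, alternative)])
    after_other = normalized_prob(logp[(alternative, target)],
                                  logp[(alternative, alternative)])
    return after_same - after_other

# Hypothetical scores, for illustration only (not results from the paper).
example_logp = {
    ("DO", "DO"): -20.0,  # DO prime, DO target
    ("DO", "PO"): -21.0,  # DO prime, PO target
    ("PO", "DO"): -21.0,  # PO prime, DO target
    ("PO", "PO"): -20.0,  # PO prime, PO target
}

effect = priming_effect(example_logp, "DO", "PO")
print(f"Priming effect for DO targets: {effect:.3f}")
```

A positive value means the target structure is relatively more probable after a prime with the same structure, which is the signature of structural priming; in the crosslingual case, the prime and target sentences would be in different languages.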
