CL · Dec 21, 2022

Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer?

arXiv:2212.10879v1 · 293 citations · h-index: 70
Originality: Synthesis-oriented
AI Analysis

This paper offers NLP researchers insights into the mechanisms of cross-lingual transfer, but it is incremental: it analyzes existing models rather than proposing new methods.

The study investigates how well multilingual BERT (mBERT) captures cross-linguistic syntactic differences across 24 languages. It finds that the distance between the grammatical relation distributions induced from mBERT aligns with syntactic differences as described by linguistic formalisms, that this distance affects zero-shot transfer performance, and that it can be predicted from morphosyntactic properties.

Multilingual BERT (mBERT) has demonstrated considerable cross-lingual syntactic ability, whereby it enables effective zero-shot cross-lingual transfer of syntactic knowledge. The transfer is more successful between some languages than others, but it is not well understood what leads to this variation and whether it fairly reflects differences between languages. In this work, we investigate the distributions of grammatical relations induced from mBERT in the context of 24 typologically different languages. We demonstrate that the distance between the distributions of different languages is highly consistent with their syntactic differences in terms of linguistic formalisms. Such differences, learnt via self-supervision, play a crucial role in zero-shot transfer performance and can be predicted by variation in morphosyntactic properties between languages. These results suggest that mBERT properly encodes languages in a way consistent with linguistic diversity and provide insights into the mechanism of cross-lingual transfer.
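
The abstract compares languages by the distance between their grammatical relation distributions but does not say which distance is used. As a rough illustration of the idea only, the sketch below computes the Jensen-Shannon divergence between two toy distributions over relation labels. The metric is a stand-in chosen for simplicity, not the paper's method, and the function names, label set, and counts are all hypothetical.

```python
import math


def normalize(counts):
    """Turn raw label counts into a probability distribution."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q map labels to probabilities. The result lies in [0, 1]:
    0 for identical distributions, 1 for distributions with disjoint support.
    """
    labels = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in labels}

    def kl(a):
        # KL(a || m); terms with a[k] == 0 contribute nothing.
        return sum(
            a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0.0
        )

    return 0.5 * kl(p) + 0.5 * kl(q)


# Made-up relation counts for two hypothetical languages (not real data).
lang_a = normalize({"nsubj": 40, "obj": 30, "amod": 20, "case": 10})
lang_b = normalize({"nsubj": 35, "obj": 25, "amod": 10, "case": 30})

distance_ab = js_divergence(lang_a, lang_b)
```

A symmetric, bounded measure such as this one makes pairwise language comparisons easy to tabulate, but it operates on label frequencies rather than on representations induced from mBERT, so it should be read as an analogy for the comparison described above rather than a reproduction of it.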

Code Implementations: 1 repo
