CL · May 24, 2022

Word-order typology in Multilingual BERT: A case study in subordinate-clause detection

arXiv:2205.11987v1628 citations · h-index: 34
Originality: Synthesis-oriented
AI Analysis

This paper addresses a question of interest to NLP researchers: how well BERT captures syntactic abstractions across languages. Its contribution is incremental, since it focuses on a single case study.

The study investigates BERT's ability to learn syntactic abstractions across languages, using subordinate-clause detection as the probe. It finds that zero-shot performance is heavily influenced by word-order typology, and that easy gains on the task are offset by a long tail of harder cases.

The capabilities and limitations of BERT and similar models are still unclear when it comes to learning syntactic abstractions, in particular across languages. In this paper, we use the task of subordinate-clause detection within and across languages to probe these properties. We show that this task is deceptively simple, with easy gains offset by a long tail of harder cases, and that BERT's zero-shot performance is dominated by word-order effects, mirroring the SVO/VSO/SOV typology.

