CL · Sep 28, 2021

Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations

arXiv:2109.14017v1661 citations
Originality: Incremental advance
AI Analysis

This work addresses the fundamental question of how language models process syntax, which is crucial for NLP researchers and practitioners, though it is incremental in building on existing probing methods.

The paper investigates how Transformer-based language models encode syntactic information. It proposes nine probing datasets with controllable text perturbations for English, Swedish, and Russian, and finds that syntactic sensitivity varies with the language and the pre-training objective, and that the models rarely use positional information for syntax.

Recent research has adopted a new experimental field centered around the concept of text perturbations, which has revealed that shuffled word order has little to no impact on the downstream performance of Transformer-based language models across many NLP tasks. These findings contradict the common understanding of how the models encode hierarchical and structural information and even question if the word order is modeled with position embeddings. To this end, this paper proposes nine probing datasets organized by the type of controllable text perturbation for three Indo-European languages with a varying degree of word order flexibility: English, Swedish and Russian. Based on the probing analysis of the M-BERT and M-BART models, we report that the syntactic sensitivity depends on the language and model pre-training objectives. We also find that the sensitivity grows across layers together with the increase of the perturbation granularity. Last but not least, we show that the models barely use the positional information to induce syntactic trees from their intermediate self-attention and contextualized representations.
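To make the idea of a controllable perturbation concrete, here is a minimal Python sketch that shuffles a sentence in fixed-size chunks. It is an illustration only, not the paper's dataset construction: the function name `shuffle_ngrams`, the chunking scheme, and the `seed` parameter are assumptions introduced here. The chunk size `n` acts as the control knob, since word order is preserved inside each chunk and only the chunks are reordered.

```python
import random


def shuffle_ngrams(tokens, n, seed=0):
    """Hypothetical helper: shuffle a token list in chunks of size n.

    The tokens are split into consecutive chunks of length n (the last
    chunk may be shorter), the chunks are shuffled, and the result is
    flattened. Order inside each chunk is kept, so n controls how
    fine-grained the perturbation is.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    chunks = [tokens[i:i + n] for i in range(0, len(tokens), n)]
    rng = random.Random(seed)  # local RNG keeps the result reproducible
    rng.shuffle(chunks)
    return [tok for chunk in chunks for tok in chunk]


sentence = "the cat sat on the mat".split()
for n in (1, 2, 3):
    # Smaller n breaks up more of the original word order.
    print(n, " ".join(shuffle_ngrams(sentence, n)))
```

With `n` equal to the sentence length there is a single chunk and the sentence is returned unchanged, which gives a natural unperturbed baseline for comparison.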

Code Implementations: 1 repo
