CLOct 12, 2019

Zero-shot Dependency Parsing with Pre-trained Multilingual Sentence Representations

arXiv:1910.05479v11021 citations
Originality Incremental advance
AI Analysis

This work addresses dependency parsing for low-resource languages without translation data, though it reveals limitations in universality.

The paper tackled the problem of unsupervised universal dependency parsing using multilingual BERT, achieving state-of-the-art results in six low-resource languages at CoNLL 2018, but found variability in accuracy and failures in some languages.

We investigate whether off-the-shelf deep bidirectional sentence representations trained on a massively multilingual corpus (multilingual BERT) enable the development of an unsupervised universal dependency parser. This approach only leverages a mix of monolingual corpora in many languages and does not require any translation data making it applicable to low-resource languages. In our experiments we outperform the best CoNLL 2018 language-specific systems in all of the shared task's six truly low-resource languages while using a single system. However, we also find that (i) parsing accuracy still varies dramatically when changing the training languages and (ii) in some target languages zero-shot transfer fails under all tested conditions, raising concerns on the 'universality' of the whole approach.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes