CL | Mar 16, 2022

Zero-Shot Dependency Parsing with Worst-Case Aware Automated Curriculum Learning

Miryam de Lhoneux, Sheng Zhang, Anders Søgaard

arXiv:2203.08555v1 | 32.0640 citations | h-index: 83 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of parsing truly low-resource languages when source and training languages are unrelated. The advance is incremental, since it builds on existing multilingual models and curriculum learning methods.

The paper tackles cross-lingual dependency parsing for low-resource languages by using automated curriculum learning to optimize performance on outlier languages, and shows that this approach significantly outperforms uniform and size-proportional sampling in zero-shot settings.

Large multilingual pretrained language models such as mBERT and XLM-RoBERTa have been found to be surprisingly effective for cross-lingual transfer of syntactic parsing models (Wu and Dredze 2019), but only between related languages. However, source and training languages are rarely related when parsing truly low-resource languages. To close this gap, we adopt a method from multi-task learning, which relies on automated curriculum learning, to dynamically optimize for parsing performance on outlier languages. We show that this approach is significantly better than uniform and size-proportional sampling in the zero-shot setting.
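
To make the comparison in the abstract concrete, the sketch below contrasts the two baseline sampling schemes (uniform and size-proportional) with a simplified worst-case-aware choice of the next training language. It is an illustration only, not the authors' implementation: the function names, the `phi` parameter, the loss-proportional fallback, and all numbers are assumptions introduced here.

```python
import random


def uniform_probs(sizes):
    """Every training language is sampled with the same probability."""
    n = len(sizes)
    return {lang: 1.0 / n for lang in sizes}


def size_proportional_probs(sizes):
    """Each language is sampled in proportion to its treebank size."""
    total = sum(sizes.values())
    return {lang: size / total for lang, size in sizes.items()}


def worst_case_aware_pick(losses, phi, rng):
    """Simplified, assumed scheme: with probability phi, train next on the
    language with the highest current loss (the current outlier); otherwise
    sample a language in proportion to its loss."""
    if rng.random() < phi:
        return max(losses, key=losses.get)
    langs = list(losses)
    weights = [losses[lang] for lang in langs]
    return rng.choices(langs, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Made-up treebank sizes and running losses, for illustration only.
    sizes = {"lang_a": 1000, "lang_b": 3000}
    losses = {"lang_a": 2.5, "lang_b": 0.5}
    rng = random.Random(0)

    print(uniform_probs(sizes))
    print(size_proportional_probs(sizes))
    print(worst_case_aware_pick(losses, phi=0.5, rng=rng))
```

The point of the contrast is that the baselines fix the sampling distribution before training starts, whereas a curriculum of this kind re-reads the per-language losses at every step, so a language the parser currently handles badly is visited more often.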
