CL, Jan 27, 2022

Systematic Investigation of Strategies Tailored for Low-Resource Settings for Low-Resource Dependency Parsing

Jivnesh Sandhan, Laxmidhar Behera, Pawan Goyal

arXiv:2201.11374v223.1267 citations, h-index: 39, Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of optimizing parsing performance for low-resource languages. Its contribution is incremental: it systematically evaluates known strategies rather than introducing new ones.

The paper tackles the problem of selecting effective strategies for low-resource dependency parsing across multiple languages, showing improvements for languages not covered by pretrained models, with a successful application to Sanskrit.

In this work, we focus on low-resource dependency parsing for multiple languages. Several strategies are tailored to enhance performance in low-resource scenarios. While these are well known to the community, it is not trivial to select the best-performing combination of these strategies for a given low-resource language, and little attention has been given to measuring their efficacy. We experiment with 5 low-resource strategies for our ensembled approach on 7 low-resource languages from Universal Dependencies (UD). Our exhaustive experimentation on these languages shows effective improvements for languages not covered by pretrained models. We also show a successful application of the ensembled system to a truly low-resource language, Sanskrit. The code and data are available at: https://github.com/Jivnesh/SanDP
