CL | Apr 16, 2020

Towards Instance-Level Parser Selection for Cross-Lingual Transfer of Dependency Parsers

Robert Litschko, Ivan Vulić, Željko Agić, Goran Glavaš

arXiv:2004.07642v131.0991 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of selecting the best source parser for low-resource target languages, offering an incremental improvement over existing treebank-level selection methods.

The paper tackles cross-lingual dependency parser transfer by proposing instance-level parser selection (ILPS), which predicts the best source parser for each individual instance rather than one parser for the whole target treebank. ILPS outperforms the KL and L2V baselines on 13 and 14 of 20 test languages, respectively. Selecting a single parser from the aggregated instance-level predictions raises this to 17 and 16 of 20 languages.

Current methods of cross-lingual parser transfer focus on predicting the best parser for a low-resource target language globally, that is, "at treebank level". In this work, we propose and argue for a novel cross-lingual transfer paradigm: instance-level parser selection (ILPS), and present a proof-of-concept study focused on instance-level selection in the framework of delexicalized parser transfer. We start from an empirical observation that different source parsers are the best choice for different Universal POS sequences in the target language. We then propose to predict the best parser at the instance level. To this end, we train a supervised regression model, based on the Transformer architecture, to predict parser accuracies for individual POS-sequences. We compare ILPS against two strong single-best parser selection baselines (SBPS): (1) a model that compares POS n-gram distributions between the source and target languages (KL) and (2) a model that selects the source based on the similarity between manually created language vectors encoding syntactic properties of languages (L2V). The results from our extensive evaluation, coupling 42 source parsers and 20 diverse low-resource test languages, show that ILPS outperforms KL and L2V on 13/20 and 14/20 test languages, respectively. Further, we show that by predicting the best parser "at the treebank level" (SBPS), using the aggregation of predictions from our instance-level model, we outperform the same baselines on 17/20 and 16/20 test languages.
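The two selection granularities in the abstract, and the KL baseline's comparison of POS n-gram distributions, can be illustrated with a minimal Python sketch. This is not the authors' implementation. All function names are hypothetical, the trigram order, the KL direction, and the epsilon smoothing are assumptions, and taking the mean as the aggregation of instance-level predictions is also an assumption, since the abstract does not specify these details. The predicted accuracies are assumed to come from some already-trained regression model.

```python
from collections import Counter
import math


def ngram_distribution(pos_sequences, n=3):
    """Relative frequencies of POS n-grams over a list of POS-tag sequences."""
    counts = Counter()
    for seq in pos_sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    total = sum(counts.values())
    if total == 0:  # every sequence was shorter than n
        return {}
    return {gram: c / total for gram, c in counts.items()}


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q); n-grams missing from q get probability eps (assumed smoothing)."""
    return sum(pv * math.log(pv / q.get(gram, eps)) for gram, pv in p.items())


def select_per_instance(pred):
    """Instance level: pred[i][k] is the predicted accuracy of parser k on
    instance i; return the index of the best parser for each instance."""
    return [max(range(len(row)), key=row.__getitem__) for row in pred]


def select_per_treebank(pred):
    """Treebank level: pick one parser for all instances by the mean of the
    instance-level predictions (mean aggregation is an assumption)."""
    n_parsers = len(pred[0])
    means = [sum(row[k] for row in pred) / len(pred) for k in range(n_parsers)]
    return max(range(n_parsers), key=means.__getitem__)
```

With three instances and three candidate parsers, `select_per_instance` may choose a different parser for each instance, while `select_per_treebank` returns the single parser with the highest average predicted accuracy.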
