CL, Aug 18, 2017

Cross-Lingual Dependency Parsing for Closely Related Languages - Helsinki's Submission to VarDial 2017

arXiv:1708.05719v1 | 1091 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses cross-lingual dependency parsing for closely related languages. It is incremental, building on existing annotation projection and treebank translation methods.

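The paper tackles cross-lingual dependency parsing for closely related languages using annotation projection and treebank translation. Slovak performs strongly with Czech treebank data, and the attachment scores of cross-lingual models even surpass those of fully supervised models trained on the target-language treebank. Gains for Croatian over the baseline are modest, and Norwegian benefits more from Swedish than from Danish.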

This paper describes the submission from the University of Helsinki to the shared task on cross-lingual dependency parsing at VarDial 2017. We present work on annotation projection and treebank translation that gave good results for all three target languages in the test set. In particular, Slovak seems to work well with information coming from the Czech treebank, which is in line with related work. The attachment scores for cross-lingual models even surpass the fully supervised models trained on the target language treebank. Croatian is the most difficult language in the test set and the improvements over the baseline are rather modest. Norwegian works best with information coming from Swedish whereas Danish contributes surprisingly little.

