CL, Sep 6, 2018

82 Treebanks, 34 Models: Universal Dependency Parsing with Multi-Treebank Models

arXiv:1809.02237v1, 110 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses dependency parsing efficiency and accuracy for multiple languages, but it is incremental as it builds on existing pipeline methods.

The authors tackled universal dependency parsing by training multi-treebank models for languages or related language groups, reducing the number of models needed. Their system ranked 7th out of 27 teams in LAS and MLAS metrics and achieved the best scores in word segmentation, universal POS tagging, and morphological features.

We present the Uppsala system for the CoNLL 2018 Shared Task on universal dependency parsing. Our system is a pipeline consisting of three components: the first performs joint word and sentence segmentation; the second predicts part-of-speech tags and morphological features; the third predicts dependency trees from words and tags. Instead of training a single parsing model for each treebank, we trained models with multiple treebanks for one language or closely related languages, greatly reducing the number of models. On the official test run, we ranked 7th of 27 teams for the LAS and MLAS metrics. Our system obtained the best scores overall for word segmentation, universal POS tagging, and morphological features.
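The multi-treebank idea in the abstract can be illustrated with a minimal sketch. It assumes treebank IDs follow the `language_treebank` pattern (for example `sv_talbanken`) and that one model is trained per language or per cluster of related languages. The function name `group_treebanks`, the sample IDs, and the `scandinavian` cluster are hypothetical choices for illustration, not the groupings the authors actually used.

```python
from collections import defaultdict


def group_treebanks(treebank_ids, related=None):
    """Map each treebank to a model group.

    The group key is the language code before the first underscore,
    unless `related` maps that code to a shared cluster name.
    """
    related = related or {}
    groups = defaultdict(list)
    for treebank in treebank_ids:
        lang = treebank.split("_", 1)[0]
        key = related.get(lang, lang)  # fall back to the language itself
        groups[key].append(treebank)
    # Sort members so the result is deterministic.
    return {key: sorted(members) for key, members in groups.items()}


# Illustrative input only; not the paper's actual treebank list.
treebanks = ["sv_talbanken", "sv_lines", "da_ddt", "en_ewt", "en_gum"]
# Hypothetical cluster of closely related languages.
related = {"sv": "scandinavian", "da": "scandinavian"}

groups = group_treebanks(treebanks, related)
print(len(treebanks), "treebanks ->", len(groups), "models")
```

Under these assumptions, five treebanks collapse into two model groups, which mirrors on a small scale how training on multiple treebanks reduces the number of models.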

