CL, Jul 5, 2021

The DCU-EPFL Enhanced Dependency Parser at the IWPT 2021 Shared Task

arXiv:2107.01982v1711 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the parsing of enhanced semantic structures, a problem of interest to natural language processing researchers, but it is incremental: it builds on existing methods and benchmarks.

The paper tackles parsing Enhanced Universal Dependencies graphs from raw text across multiple languages. The system achieved a coarse Enhanced Labeled Attachment Score of 83.57 in the shared task, which post-deadline modifications improved to 88.04.
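To make the metric concrete, the following is a minimal sketch of an ELAS-style score, assuming it is computed as the F1 over labeled enhanced edges and that "coarse" means label subtypes (everything after the first colon) are ignored. The function names and the toy edges are illustrative only; this is not the official shared task evaluation script.

```python
def coarse(label):
    """Strip the subtype from a relation label, e.g. 'obl:in' -> 'obl' (assumed meaning of 'coarse')."""
    return label.split(":", 1)[0]


def elas_f1(gold_edges, pred_edges, coarse_labels=True):
    """F1 over (dependent, head, label) edges of an enhanced graph.

    Simplified sketch: a single sentence, no empty nodes, no token alignment.
    """
    if coarse_labels:
        gold = {(d, h, coarse(l)) for d, h, l in gold_edges}
        pred = {(d, h, coarse(l)) for d, h, l in pred_edges}
    else:
        gold, pred = set(gold_edges), set(pred_edges)

    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical edges): token 1 has two heads in the gold graph.
gold = [(1, 2, "nsubj"), (2, 0, "root"), (3, 2, "obl:in"), (1, 3, "nsubj:xsubj")]
pred = [(1, 2, "nsubj"), (2, 0, "root"), (3, 2, "obl:on")]

print(round(100 * elas_f1(gold, pred), 2))                       # coarse labels
print(round(100 * elas_f1(gold, pred, coarse_labels=False), 2))  # full labels
```

In the toy example the predicted `obl:on` edge counts as correct under coarse labels but not under full labels, which is why the coarse score is the higher of the two.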

We describe the DCU-EPFL submission to the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies. The task involves parsing Enhanced UD graphs, which extend the basic dependency trees and are designed to be better suited to representing semantic structure. Evaluation is carried out on 29 treebanks in 17 languages, and participants are required to parse the data from each language starting from raw strings. Our approach uses the Stanza pipeline to pre-process the text files, XLM-RoBERTa to obtain contextualized token representations, and an edge-scoring and labeling model to predict the enhanced graph. Finally, we run a post-processing script to ensure all of our outputs are valid Enhanced UD graphs. Our system places 6th out of 9 participants with a coarse Enhanced Labeled Attachment Score (ELAS) of 83.57. We carry out additional post-deadline experiments, which include using Trankit for pre-processing, XLM-RoBERTa-LARGE, treebank concatenation, and multitask learning between a basic and an enhanced dependency parser. All of these modifications improve our initial score, and our final system has a coarse ELAS of 88.04.
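The abstract does not say what the post-processing script checks. As one illustration of what graph validation can involve, the sketch below tests a commonly cited requirement of Enhanced UD graphs: every token must be reachable from the artificial root. The function name `unreachable_nodes` and the edge format are assumptions made for this example, not the authors' implementation.

```python
from collections import deque


def unreachable_nodes(num_tokens, edges):
    """Return token ids (1..num_tokens) not reachable from the root (id 0).

    `edges` is an iterable of (head, dependent) pairs; labels are omitted
    because reachability does not depend on them. Hypothetical helper.
    """
    children = {}
    for head, dep in edges:
        children.setdefault(head, []).append(dep)

    # Breadth-first search starting at the artificial root node.
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)

    return sorted(set(range(1, num_tokens + 1)) - seen)


# Tokens 3 and 4 only point at each other, so neither is reachable from the root.
print(unreachable_nodes(4, [(0, 2), (2, 1), (4, 3), (3, 4)]))
```

A repair step would then need to attach each reported token to some reachable head; how the submitted system chooses that head is not described in the abstract.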

Code Implementations: 1 repo