CL Mar 16, 2021

Coordinate Constructions in English Enhanced Universal Dependencies: Analysis and Computational Modeling

Stefan Grünewald, Prisca Piccirilli, Annemarie Friedrich

arXiv:2103.08955v1 32.7800 citations Has Code

Originality: Incremental advance

AI Analysis

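This work addresses a specific issue in dependency parsing, the propagation of dependency links across conjuncts in English Enhanced Universal Dependencies, and is aimed at linguists and NLP practitioners. It offers an incremental improvement in how accurately these links are predicted.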

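The paper addresses how coordinate constructions are represented in English Enhanced Universal Dependencies. The authors build a large manually edited dataset of syntax graphs, use it to identify systematic errors in the existing converter-produced data, and propose propagating adjuncts in addition to core arguments. They show that learned propagation rules outperform hand-designed heuristics, and that on automatic parses a neural graph-parser based edge predictor outperforms pipelines that combine a basic-layer tree parser with converters.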

In this paper, we address the representation of coordinate constructions in Enhanced Universal Dependencies (UD), where relevant dependency links are propagated from conjunction heads to other conjuncts. English treebanks for enhanced UD have been created from gold basic dependencies using a heuristic rule-based converter, which propagates only core arguments. With the aim of determining which set of links should be propagated from a semantic perspective, we create a large-scale dataset of manually edited syntax graphs. We identify several systematic errors in the original data, and propose to also propagate adjuncts. We observe high inter-annotator agreement for this semantic annotation task. Using our new manually verified dataset, we perform the first principled comparison of rule-based and (partially novel) machine-learning based methods for conjunction propagation for English. We show that learning propagation rules is more effective than hand-designing heuristic rules. When using automatic parses, our neural graph-parser based edge predictor outperforms the currently predominant pipelines using a basic-layer tree parser plus converters.
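
To make the idea of conjunction propagation concrete, the following is a minimal sketch of a rule-based propagator over a basic dependency tree. It is not the converter or any method from the paper: the `Token` type, the `propagate` function, the example sentence, and the two relation sets are hypothetical choices made for illustration, and the "skip if the conjunct already has that relation" check is a simplifying assumption.

```python
from collections import namedtuple

# Hypothetical token type: 1-based id, word form, head id (0 = root), basic relation.
Token = namedtuple("Token", "id form head deprel")

# Illustrative relation sets (assumptions, not the converter's actual rule list).
CORE_ARGS = {"nsubj", "obj", "iobj", "csubj", "ccomp", "xcomp"}
ADJUNCTS = {"obl", "advmod", "advcl"}


def propagate(tokens, relations):
    """Return extra enhanced edges as (head_id, deprel, dependent_id).

    For every non-first conjunct, copy each dependent of the first conjunct
    whose relation is in `relations`, unless the conjunct already has a
    dependent with that relation (a simplifying assumption).
    """
    extra = []
    for conj in tokens:
        if conj.deprel != "conj":
            continue
        first = conj.head  # in basic UD, conjuncts attach to the first conjunct
        own = {t.deprel for t in tokens if t.head == conj.id}
        for dep in tokens:
            if dep.head == first and dep.deprel in relations and dep.deprel not in own:
                extra.append((conj.id, dep.deprel, dep.id))
    return extra


# "She quickly bought and ate apples" with basic dependencies.
tokens = [
    Token(1, "She", 3, "nsubj"),
    Token(2, "quickly", 3, "advmod"),
    Token(3, "bought", 0, "root"),
    Token(4, "and", 5, "cc"),
    Token(5, "ate", 3, "conj"),
    Token(6, "apples", 3, "obj"),
]

core_only = propagate(tokens, CORE_ARGS)
with_adjuncts = propagate(tokens, CORE_ARGS | ADJUNCTS)
print(core_only)      # subject and object are shared with "ate"
print(with_adjuncts)  # additionally shares the adverb "quickly"
```

With `CORE_ARGS` only, "ate" receives the shared subject and object. Adding `ADJUNCTS` also attaches "quickly" to "ate"; whether that reading is intended is the kind of semantic decision the manually edited dataset is meant to settle.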
