Sentences with Gapping: Parsing and Reconstructing Elided Predicates
This work addresses a specific parsing challenge that affects natural language processing tasks such as relation extraction, but its contribution is incremental because it builds on existing dependency parsing frameworks.
The paper tackles the problem of parsing sentences with gapping, such as 'Paul likes coffee and Mary tea', by developing two methods that reconstruct elided predicates in Universal Dependencies graphs. Both methods reconstruct the elided material with high accuracy when the parser correctly predicts that a gap exists, and a case study on Swedish shows that one of the methods can be applied to other languages.
Sentences with gapping, such as 'Paul likes coffee and Mary tea', lack an overt predicate to indicate the relation between two or more arguments. Surface syntax representations of such sentences are often produced poorly by parsers, and even when correct, they are not well suited to downstream natural language understanding tasks such as relation extraction, which are typically designed to extract information from sentences with canonical clause structure. In this paper, we present two methods for parsing to a Universal Dependencies graph representation that explicitly encodes the elided material with additional nodes and edges. We find that both methods can reconstruct elided material from dependency trees with high accuracy when the parser correctly predicts the existence of a gap. We further demonstrate that one of our methods can be applied to other languages based on a case study on Swedish.
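
To make the target representation concrete, the following Python sketch starts from a basic Universal Dependencies tree for 'Paul likes coffee and Mary tea', in which the remnant 'tea' is attached with the orphan relation, and adds an empty node that copies the overt predicate. It is a toy illustration and not either of the paper's two methods: the Token class, the function names, the placement of the empty node at id 5.1, and the alignment rule (remnants receive the core relations of the full clause in surface order) are all simplifying assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: str      # "1", "2", ... or "5.1" for an empty node
    form: str
    head: str    # id of the governor, "0" for the root
    deprel: str


# Basic UD tree: "Mary" is promoted to head the gapped conjunct,
# and the second remnant "tea" attaches to it as an orphan.
BASIC = [
    Token("1", "Paul", "2", "nsubj"),
    Token("2", "likes", "0", "root"),
    Token("3", "coffee", "2", "obj"),
    Token("4", "and", "5", "cc"),
    Token("5", "Mary", "2", "conj"),
    Token("6", "tea", "5", "orphan"),
]

CORE = ("nsubj", "obj", "iobj")


def reconstruct_gap(tokens):
    """Insert an empty predicate node for a single gapped conjunct.

    Assumption (illustration only): the remnants of the gapped clause line up
    one-to-one, in surface order, with the core arguments of the full clause.
    """
    by_id = {t.id: t for t in tokens}
    orphans = [t for t in tokens if t.deprel == "orphan"]
    if not orphans:
        return list(tokens)  # no gap signalled in the tree
    promoted = by_id[orphans[0].head]   # "Mary"
    verb = by_id[promoted.head]         # the overt predicate "likes"
    template = [t.deprel for t in tokens
                if t.head == verb.id and t.deprel in CORE]
    remnants = [promoted] + [t for t in orphans if t.head == promoted.id]
    if len(remnants) != len(template):
        raise ValueError("remnants do not align with the full clause")
    empty_id = promoted.id + ".1"       # hypothetical placement of the copy
    new_rel = {r.id: rel for r, rel in zip(remnants, template)}
    out = []
    for t in tokens:
        if t.id in new_rel:
            # remnant: attach to the empty node with the aligned relation
            out.append(Token(t.id, t.form, empty_id, new_rel[t.id]))
        elif t.head == promoted.id:
            # other dependents of the promoted word (here "and") move too
            out.append(Token(t.id, t.form, empty_id, t.deprel))
        else:
            out.append(Token(t.id, t.form, t.head, t.deprel))
        if t.id == promoted.id:
            # the empty node copies the predicate and takes over the conj edge
            out.append(Token(empty_id, verb.form, verb.id, promoted.deprel))
    return out


def triples(tokens):
    """Naive relation extraction: (subject, predicate, object) per predicate."""
    result = []
    for pred in tokens:
        args = {t.deprel: t.form for t in tokens if t.head == pred.id}
        if "nsubj" in args and "obj" in args:
            result.append((args["nsubj"], pred.form, args["obj"]))
    return result


print(triples(BASIC))
print(triples(reconstruct_gap(BASIC)))
```

On the basic tree, the naive extractor finds only the relation in the first conjunct. Once the empty node is added, the gapped conjunct has canonical clause structure and the same extractor also recovers the relation between 'Mary' and 'tea', which is the motivation the abstract gives for encoding elided material explicitly.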