A Transition-Based Directed Acyclic Graph Parser for UCCA
This work addresses the parsing of complex semantic DAG structures, a problem of interest to natural language processing researchers, particularly in cross-linguistic applications. It is incremental in that it builds on existing parsing techniques, extending them to handle UCCA's distinctive formal properties.
The authors address the challenge of parsing UCCA, a semantic representation framework that exhibits reentrancy, discontinuous structures, and non-terminal nodes. They develop the first transition-based parser for it, using a novel transition set and bidirectional LSTM features; as the first parser for the framework, its results serve as an initial benchmark for the task.
We present the first parser for UCCA, a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. UCCA poses a challenge for existing parsing techniques, as it exhibits reentrancy (resulting in DAG structures), discontinuous structures, and non-terminal nodes corresponding to complex semantic units. To our knowledge, the conjunction of these formal properties is not supported by any existing parser. Our transition-based parser, which uses a novel transition set and features based on bidirectional LSTMs, has value not just for UCCA parsing: its ability to handle more general graph structures can inform the development of parsers for other semantic DAG structures, and for languages that frequently use discontinuous structures.
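To make the three formal properties concrete, the following toy sketch shows how a stack-and-buffer transition system can, in principle, produce a node with two parents (reentrancy), reorder items (useful for discontinuity), and create non-terminal nodes. It is an illustration under stated assumptions, not the transition set described in this work: the class `ParserState`, the transition names (`shift`, `reduce`, `node`, `left_edge`, `right_edge`, `swap`), the item names `t0`..`t2`, and the edge labels are all hypothetical, and the transition sequence is hand-written rather than predicted by a classifier over bidirectional LSTM features.

```python
from dataclasses import dataclass, field


@dataclass
class ParserState:
    """Toy parser state: a buffer of pending items, a stack, and labeled edges."""
    buffer: list
    stack: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (parent, child, label) triples
    next_id: int = 0

    def shift(self):
        # Move the first buffer item onto the stack.
        self.stack.append(self.buffer.pop(0))

    def reduce(self):
        # Discard the stack top once it needs no further edges.
        self.stack.pop()

    def node(self, label):
        # Create a new non-terminal parent of the stack top and
        # place it at the front of the buffer for later processing.
        parent = f"n{self.next_id}"
        self.next_id += 1
        self.edges.append((parent, self.stack[-1], label))
        self.buffer.insert(0, parent)

    def left_edge(self, label):
        # The stack top becomes a parent of the second stack item.
        self.edges.append((self.stack[-1], self.stack[-2], label))

    def right_edge(self, label):
        # The second stack item becomes a parent of the stack top.
        self.edges.append((self.stack[-2], self.stack[-1], label))

    def swap(self):
        # Move the second stack item back to the buffer, so that
        # non-adjacent items can later meet on the stack.
        self.buffer.insert(0, self.stack.pop(-2))


def parents(state, child):
    """Return the sorted parents of `child` in the graph built so far."""
    return sorted(p for p, c, _ in state.edges if c == child)


def run_demo():
    # Hand-written (hypothetical) transition sequence over three items.
    state = ParserState(buffer=["t0", "t1", "t2"])
    state.shift()          # stack: [t0]
    state.shift()          # stack: [t0, t1]
    state.left_edge("A")   # t1 -> t0
    state.swap()           # t0 returns to the buffer; stack: [t1]
    state.reduce()         # stack: []
    state.shift()          # stack: [t0]
    state.shift()          # stack: [t0, t2]
    state.left_edge("A")   # t2 -> t0, so t0 now has two parents
    state.node("H")        # new non-terminal n0 -> t2
    return state


if __name__ == "__main__":
    final = run_demo()
    print(final.edges)
    print(parents(final, "t0"))
```

Because edges are recorded without removing the child from the stack, an item can receive a second parent later in the derivation, which is what separates a DAG-producing system from a tree-producing one in this sketch.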