CL · Feb 5, 2020

Discontinuous Constituent Parsing with Pointer Networks

arXiv:2002.01824v1 · 20 citations
AI Analysis

This work addresses a complex syntactic representation problem in computational linguistics and NLP that matters particularly for languages with discontinuous constituents. It reports a substantial, task-specific gain rather than an incremental improvement.

The paper tackles the parsing of discontinuous constituent trees, which are crucial for languages like German. It proposes a novel neural architecture based on Pointer Networks that models these structures as augmented non-projective dependencies, and it achieves state-of-the-art results on the NEGRA and TIGER benchmarks, outperforming previous work by a wide margin.

Discontinuous constituent trees are among the most complex syntactic representations used in computational linguistics and NLP, and they are crucial for representing all grammatical phenomena of languages such as German. Recent advances in dependency parsing have shown that Pointer Networks excel at efficiently parsing syntactic relations between the words of a sentence. These sequence-to-sequence models achieve outstanding accuracy in building non-projective dependency trees, but their potential has not yet been demonstrated on a more difficult task. We propose a novel neural network architecture that, by means of Pointer Networks, generates the most accurate discontinuous constituent representations to date, even without Part-of-Speech tagging information. To do so, we internally model discontinuous constituent structures as augmented non-projective dependency structures. The proposed approach achieves state-of-the-art results on the two widely used NEGRA and TIGER benchmarks, outperforming previous work by a wide margin.
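
The two structural notions the abstract relies on can be made concrete with a small sketch. This is not the paper's implementation or its encoding of constituents as dependencies; it only illustrates, under assumed toy inputs, what makes a constituent discontinuous (its yield leaves a gap) and what makes a dependency tree non-projective (two arcs cross). The function names, the 1-based token positions, and the use of 0 as an artificial root are all assumptions made for this illustration.

```python
from typing import List, Set


def is_discontinuous(yield_positions: Set[int]) -> bool:
    """Return True if the token positions covered by a constituent leave a gap."""
    if not yield_positions:
        return False
    span_length = max(yield_positions) - min(yield_positions) + 1
    return span_length != len(yield_positions)


def has_crossing_arcs(heads: List[int]) -> bool:
    """Return True if any two dependency arcs cross.

    heads[i] is the head of token i + 1 (tokens are 1-based, 0 is an
    artificial root placed before the sentence). Crossing arcs are the
    usual sign of a non-projective tree.
    """
    # Store each arc as (left endpoint, right endpoint), ignoring direction.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for a, b in arcs:
        for c, d in arcs:
            # Arc (c, d) starts strictly inside (a, b) and ends strictly outside it.
            if a < c < b < d:
                return True
    return False


# Hypothetical toy inputs, not taken from NEGRA or TIGER.
gapped = is_discontinuous({1, 2, 5})        # positions 3 and 4 are missing
contiguous = is_discontinuous({2, 3, 4})    # one unbroken span
crossing = has_crossing_arcs([3, 0, 2, 1])  # arcs (0, 2) and (1, 3) cross
nested = has_crossing_arcs([2, 0, 2, 3])    # no pair of arcs crosses
```

In this toy setting, a parser restricted to contiguous spans could not produce the first constituent, and a projective dependency parser could not produce the first tree, which is why the abstract pairs discontinuous constituents with non-projective dependencies.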

Code Implementations: 1 repo
