CL · Feb 5, 2020

Discontinuous Constituent Parsing with Pointer Networks

Daniel Fernández-González, Carlos Gómez-Rodríguez

arXiv:2002.01824v11.420 citations · Has Code

Originality Highly original

AI Analysis

This work addresses a complex syntactic representation problem in computational linguistics and NLP, one that matters most for languages with discontinuous constituents. Its contribution is a strong, specific accuracy gain rather than an incremental improvement.

The paper tackles the problem of parsing discontinuous constituent trees, which are crucial for languages such as German. It proposes a novel neural architecture based on Pointer Networks that models these structures as augmented non-projective dependencies, and it achieves state-of-the-art results on the NEGRA and TIGER benchmarks, outperforming previous work by a wide margin.

Discontinuous constituent trees are among the most complex syntactic representations used in computational linguistics and NLP, and they are crucial for representing all grammatical phenomena of languages such as German. Recent advances in dependency parsing have shown that Pointer Networks excel at efficiently parsing syntactic relations between the words of a sentence. This kind of sequence-to-sequence model achieves outstanding accuracy in building non-projective dependency trees, but its potential has not yet been proven on a more difficult task. We propose a novel neural network architecture that, by means of Pointer Networks, is able to generate the most accurate discontinuous constituent representations to date, even without the need for Part-of-Speech tagging information. To do so, we internally model discontinuous constituent structures as augmented non-projective dependency structures. The proposed approach achieves state-of-the-art results on the two widely used NEGRA and TIGER benchmarks, outperforming previous work by a wide margin.
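To make the pointer idea in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each word already has an encoding vector and uses a plain dot product as the attention score, whereas real Pointer Network parsers use learned encoders and scoring functions. The names `point_to_heads` and `has_crossing_arcs` are hypothetical. For each word, the sketch "points" at the highest-scoring other position as its head (position 0 stands for an artificial root), and a second helper checks whether the resulting arcs cross, which is what makes a dependency structure non-projective.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def point_to_heads(encodings):
    """Pick a head position for every word by pointer-style attention.

    encodings[0] is an artificial ROOT; encodings[1:] are the words.
    Assumption: dot-product scoring stands in for a learned scorer.
    No tree constraint (e.g. cycle prevention) is enforced here.
    """
    n = len(encodings)
    heads = []
    for i in range(1, n):
        # A word may not point at itself, so mask its own position.
        scores = [
            dot(encodings[i], encodings[j]) if j != i else float("-inf")
            for j in range(n)
        ]
        probs = softmax(scores)
        heads.append(max(range(n), key=probs.__getitem__))
    return heads


def has_crossing_arcs(heads):
    """Return True if any two dependency arcs cross (non-projective).

    heads[k] is the head position of word k + 1; 0 denotes ROOT.
    """
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    return any(a < c < b < d for a, b in arcs for c, d in arcs)


if __name__ == "__main__":
    toy = [[1.0, 0.0], [2.0, 0.0], [0.5, 1.0]]  # ROOT plus two words
    heads = point_to_heads(toy)
    print(heads, has_crossing_arcs(heads))
```

The sketch covers only head selection. The "augmented" part of the representation described in the abstract, which lets constituent structure be recovered from the dependencies, is not modelled here.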
