CL · May 27, 2020

Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing

arXiv:2005.13334v1998 citations
Originality: Incremental advance
AI Analysis

This work targets the accuracy and speed of sequence-to-sequence constituent parsing, presenting an incremental improvement over existing tree linearizations.

The paper tackles sequence-to-sequence constituent parsing by proposing an enriched in-order shift-reduce linearization. It achieves the best accuracy to date on the English PTB dataset among fully-supervised single-model sequence-to-sequence constituent parsers and, by applying deterministic attention mechanisms, matches the speed of state-of-the-art transition-based parsers.

Sequence-to-sequence constituent parsing requires a linearization to represent trees as sequences. Top-down tree linearizations, which can be based on brackets or shift-reduce actions, have achieved the best accuracy to date. In this paper, we show that these results can be improved by using an in-order linearization instead. Based on this observation, we implement an enriched in-order shift-reduce linearization inspired by Vinyals et al. (2015)'s approach, achieving the best accuracy to date on the English PTB dataset among fully-supervised single-model sequence-to-sequence constituent parsers. Finally, we apply deterministic attention mechanisms to match the speed of state-of-the-art transition-based parsers, thus showing that sequence-to-sequence models can match them, not only in accuracy, but also in speed.
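To make the difference between the linearizations concrete, here is a minimal sketch that turns a small constituent tree into a top-down and an in-order shift-reduce action sequence. The `Node` class, the action names (`SH`, `NT-X`, `PJ-X`, `RE`), and the toy sentence are illustrative assumptions, not the paper's code. The `enriched` flag labels each reduce action with its nonterminal, which is only one plausible reading of "enriched" based on the labeled closing brackets of Vinyals et al. (2015); the abstract does not spell out the exact scheme. POS tags and any final "finish" action are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A constituent with a nonterminal label; children are Nodes or words (str)."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)


def top_down(tree: Union[Node, str]) -> List[str]:
    """Pre-order: open the nonterminal, emit all children, then reduce."""
    if isinstance(tree, str):
        return ["SH"]  # a word is shifted from the buffer
    actions = [f"NT-{tree.label}"]
    for child in tree.children:
        actions += top_down(child)
    actions.append("RE")
    return actions


def in_order(tree: Union[Node, str], enriched: bool = False) -> List[str]:
    """In-order: first child, then project the nonterminal, then the rest, then reduce.

    Assumes every Node has at least one child.
    With enriched=True, reduce actions carry the nonterminal label
    (an assumption about what the enrichment looks like).
    """
    if isinstance(tree, str):
        return ["SH"]
    first, *rest = tree.children
    actions = in_order(first, enriched)
    actions.append(f"PJ-{tree.label}")
    for child in rest:
        actions += in_order(child, enriched)
    actions.append(f"RE-{tree.label}" if enriched else "RE")
    return actions


# Toy tree: (S (NP The cat) (VP sleeps))
tree = Node("S", [Node("NP", ["The", "cat"]), Node("VP", ["sleeps"])])

print(" ".join(top_down(tree)))
# NT-S NT-NP SH SH RE NT-VP SH RE RE
print(" ".join(in_order(tree)))
# SH PJ-NP SH RE PJ-S SH PJ-VP RE RE
print(" ".join(in_order(tree, enriched=True)))
# SH PJ-NP SH RE-NP PJ-S SH PJ-VP RE-VP RE-S
```

In the in-order sequence each nonterminal is predicted only after its first child has been built, so the decoder has some bottom-up evidence before committing to a label, whereas the top-down sequence must predict `NT-S` before seeing any word.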

Code Implementations: 1 repo