CL, LG · Apr 30, 2020

A Span-based Linearization for Constituent Trees

arXiv:2004.14704v2998 citations
AI Analysis

The method improves parsing efficiency and interpretability for NLP researchers, though it appears incremental over existing local and global approaches.

The paper tackles constituent parsing by proposing a span-based linearization method and a locally normalized model that predicts a tree span at each split point. Experiments report 95.8 F1 on PTB and 92.4 F1 on CTB, outperforming existing local models and remaining competitive with global models at lower computational cost.

We propose a novel linearization of a constituent tree, together with a new locally normalized model. For each split point in a sentence, our model computes the normalizer on all spans ending with that split point, and then predicts a tree span from them. Compared with global models, our model is fast and parallelizable. Different from previous local models, our linearization method is tied on the spans directly and considers more local features when performing span prediction, which is more interpretable and effective. Experiments on PTB (95.8 F1) and CTB (92.4 F1) show that our model significantly outperforms existing local models and efficiently achieves competitive results with global models.
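The per-split-point normalization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `scores` table (where `scores[i][j]` scores the span from split point `i` to split point `j`) and the function name `predict_left_boundaries` are hypothetical, and the neural scorer that would produce the scores is left out. For each right split point `j`, the sketch normalizes over all spans ending at `j` and picks the most probable left boundary.

```python
import math


def predict_left_boundaries(scores):
    """For each right split point j, pick a left boundary i < j.

    scores[i][j] is a hypothetical score for the span (i, j); only
    entries with i < j are read. Returns {j: (best_i, probability)}.
    """
    n = len(scores) - 1  # split points are 0..n for an n-word sentence
    predictions = {}
    for j in range(1, n + 1):
        # Candidate spans are exactly those ending at split point j.
        candidates = [scores[i][j] for i in range(j)]
        # Local normalizer (log-sum-exp, shifted by the max for stability).
        m = max(candidates)
        log_z = m + math.log(sum(math.exp(s - m) for s in candidates))
        probs = [math.exp(s - log_z) for s in candidates]
        best_i = max(range(j), key=lambda i: probs[i])
        predictions[j] = (best_i, probs[best_i])
    return predictions


if __name__ == "__main__":
    # Toy scores for a 3-word sentence (assumed values, illustration only).
    toy_scores = [
        [0.0, 0.0, 2.0, 1.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 3.0],
        [0.0, 0.0, 0.0, 0.0],
    ]
    for j, (i, p) in predict_left_boundaries(toy_scores).items():
        print(f"split point {j}: span ({i}, {j}) with probability {p:.3f}")
```

Each iteration of the loop over `j` depends only on the scores of spans ending at `j`, so the iterations are independent of one another. That independence is consistent with the abstract's claim that the model is parallelizable.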

Code Implementations: 1 repo
