A Span-based Linearization for Constituent Trees
The proposed method improves parsing efficiency and interpretability for NLP researchers, though it appears incremental over existing local and global approaches.
The paper tackles constituent parsing by proposing a span-based linearization method and a locally normalized model that predicts a tree span at each split point. Experiments report 95.8 F1 on PTB and 92.4 F1 on CTB, outperforming existing local models and achieving results competitive with global models while remaining fast and parallelizable.
We propose a novel linearization of a constituent tree, together with a new locally normalized model. For each split point in a sentence, our model computes the normalizer over all spans ending at that split point, and then predicts a tree span from among them. Compared with global models, our model is fast and parallelizable. Unlike previous local models, our linearization method is tied directly to the spans and considers more local features when performing span prediction, which makes it more interpretable and effective. Experiments on PTB (95.8 F1) and CTB (92.4 F1) show that our model significantly outperforms existing local models and efficiently achieves results competitive with global models.
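The per-split-point normalization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the indexing convention (split points j = 1..n, candidate spans (i, j) with 0 <= i < j), and the toy scorer are all assumptions made for illustration, and the paper's model would use a learned neural scorer instead. The sketch only shows the local step, a softmax over the spans ending at one split point followed by an argmax; assembling the predicted spans into a tree is not shown.

```python
import math


def local_span_distribution(score, j):
    """Softmax over all spans (i, j) with 0 <= i < j, i.e. spans ending at split point j.

    `score` is a hypothetical callable score(i, j) -> number.
    """
    logits = [score(i, j) for i in range(j)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)  # the normalizer for this split point only
    return [e / z for e in exps]


def predict_left_boundaries(score, n):
    """For each split point j = 1..n, pick the most probable left boundary i.

    Each j is handled independently of the others, which is why this
    step could be run in parallel across split points.
    """
    prediction = {}
    for j in range(1, n + 1):
        probs = local_span_distribution(score, j)
        prediction[j] = max(range(j), key=probs.__getitem__)
    return prediction


def toy_score(i, j):
    """Hypothetical scorer that simply prefers spans of length 2."""
    return -abs((j - i) - 2)


if __name__ == "__main__":
    print(predict_left_boundaries(toy_score, 4))
```

Because the normalizer for split point j involves only the j candidate spans ending there, no dynamic program over the whole sentence is needed at this step, which is the contrast with globally normalized models that the abstract draws.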