CL Apr 6, 2019

Speeding Up Natural Language Parsing by Reusing Partial Results

Michalina Strzyz, Carlos Gómez-Rodríguez

arXiv:1904.03417v10.21 citations

Originality: Incremental advance

AI Analysis

This work targets parsing efficiency: it offers NLP practitioners an incremental gain in dependency parsing speed at a small cost in accuracy.

The paper speeds up natural language dependency parsing by reusing partial parse results through templates derived with case-based reasoning. On English, it reduces input length by more than 20% at a cost of less than 3 points of Unlabeled Attachment Score (UAS).

This paper proposes a novel technique that applies case-based reasoning in order to generate templates for reusable parse tree fragments, based on PoS tags of bigrams and trigrams that demonstrate low variability in their syntactic analyses from prior data. The aim of this approach is to improve the speed of dependency parsers by avoiding redundant calculations. This can be resolved by applying the predefined templates that capture results of previous syntactic analyses and directly assigning the stored structure to a new n-gram that matches one of the templates, instead of parsing a similar text fragment again. The study shows that using a heuristic approach to select and reuse the partial results increases parsing speed by reducing the input length to be processed by a parser. The increase in parsing speed comes at some expense of accuracy. Experiments on English show promising results: the input dimension can be reduced by more than 20% at the cost of less than 3 points of Unlabeled Attachment Score.
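The template idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy treebank, and the `min_count` and `min_share` thresholds are hypothetical stand-ins for the paper's selection heuristic. The sketch assumes each sentence is given as a list of PoS tags plus a list of 0-based head indices (-1 for the root), and it keeps a template only when exactly one token of the n-gram has its head outside the n-gram. It ignores the case where an outside word attaches to a pre-attached token.

```python
from collections import Counter, defaultdict


def signature(heads, start, n):
    """Head of each token relative to the n-gram start, or None if the head lies outside."""
    return tuple(
        heads[i] - start if start <= heads[i] < start + n else None
        for i in range(start, start + n)
    )


def mine_templates(treebank, sizes=(2, 3), min_count=2, min_share=0.9):
    """Keep PoS n-grams whose dominant analysis covers at least min_share of occurrences."""
    counts = defaultdict(Counter)
    for tags, heads in treebank:
        for n in sizes:
            for start in range(len(tags) - n + 1):
                ngram = tuple(tags[start:start + n])
                counts[ngram][signature(heads, start, n)] += 1
    templates = {}
    for ngram, sigs in counts.items():
        sig, freq = sigs.most_common(1)[0]
        total = sum(sigs.values())
        # Assumption: a reusable fragment has exactly one token attached outside it.
        if freq >= min_count and freq / total >= min_share and sig.count(None) == 1:
            templates[ngram] = sig
    return templates


def apply_templates(tags, templates, sizes=(3, 2)):
    """Return (preassigned, remaining).

    preassigned maps a dependent index to its head index taken from a template;
    remaining lists the token indices that are still passed to the parser.
    """
    preassigned, remaining = {}, []
    i = 0
    while i < len(tags):
        for n in sizes:  # try longer templates first
            if i + n > len(tags):
                continue
            sig = templates.get(tuple(tags[i:i + n]))
            if sig is not None:
                for offset, rel_head in enumerate(sig):
                    if rel_head is None:
                        remaining.append(i + offset)  # fragment head stays in the input
                    else:
                        preassigned[i + offset] = i + rel_head
                i += n
                break
        else:
            remaining.append(i)  # no template matched at this position
            i += 1
    return preassigned, remaining


# Toy treebank (hypothetical data): (PoS tags, head indices).
treebank = [
    (["DET", "NOUN", "VERB"], [1, 2, -1]),
    (["DET", "NOUN", "VERB"], [1, 2, -1]),
    (["DET", "NOUN", "VERB", "ADV"], [1, 2, -1, 2]),
]
templates = mine_templates(treebank)
preassigned, remaining = apply_templates(["DET", "NOUN", "VERB", "ADV"], templates)
print(preassigned, remaining)  # {0: 1, 1: 2} [2, 3]
```

In this toy run two of the four tokens receive their heads from a stored template, so only two tokens remain for the parser. That shorter input is the source of the speed-up, and any mismatch between a template and the correct analysis is the source of the accuracy loss.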
