Effective Representation for Easy-First Dependency Parsing
This work addresses a specific bottleneck in dependency parsing for NLP researchers, representing an incremental improvement over existing easy-first approaches.
The paper tackles the problem of representing intermediate subtrees in easy-first dependency parsing by introducing a bottom-up subtree encoding method based on child-sum tree-LSTM, which improves parsing performance and achieves state-of-the-art results on benchmark treebanks when combined with pre-trained language models.
Easy-first parsing relies on subtree re-ranking to build the complete parse tree. Whereas the intermediate state of parsing processing is represented by various subtrees, whose internal structural information is the key lead for later parsing action decisions, we explore a better representation for such subtrees. In detail, this work introduces a bottom-up subtree encoding method based on the child-sum tree-LSTM. Starting from an easy-first dependency parser without other handcraft features, we show that the effective subtree encoder does promote the parsing process, and can make a greedy search easy-first parser achieve promising results on benchmark treebanks compared to state-of-the-art baselines. Furthermore, with the help of the current pre-training language model, we further improve the state-of-the-art results of the easy-first approach.