CL · Jul 9, 2019

Cross-Domain Generalization of Neural Constituency Parsers

arXiv:1907.04347v1 · 103 citations
Originality: Incremental advance
AI Analysis

This work addresses domain adaptation, a problem that matters to NLP researchers and practitioners, but it is an incremental advance because it builds on existing methods to analyze generalization.

The study investigates the cross-domain generalization of neural constituency parsers in a zero-shot setting. It finds that incorporating pre-trained encoders improves performance across all domains and that structured output prediction strengthens generalization, and it reports state-of-the-art results on the Brown, Genia, and English Web treebanks.

Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing -- but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. First, neural and non-neural parsers generalize comparably to new domains. Second, incorporating pre-trained encoder representations into neural parsers substantially improves their performance across all domains, but does not give a larger relative improvement for out-of-domain treebanks. Finally, despite the rich input representations they learn, neural parsers still benefit from structured output prediction of output trees, yielding higher exact match accuracy and stronger generalization both to larger text spans and to out-of-domain corpora. We analyze generalization on English and Chinese corpora, and in the process obtain state-of-the-art parsing results for the Brown, Genia, and English Web treebanks.

Code Implementations: 1 repo
