CL · May 16, 2018

Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples

Vidur Joshi, Matthew Peters, Mark Hopkins

arXiv:1805.06556v132.61139 citations · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the problem of adapting parsers to new domains, which matters to NLP practitioners. It offers a simple, data-efficient solution that is an incremental advance over existing methods.

The paper tackles domain adaptation for neural parsers. It shows that modern word representations reduce the need for adaptation when the target domain is syntactically similar to the source domain. For more distant domains, it introduces a method that uses a few dozen partial annotations, raising the share of error-free geometry-domain parses on a held-out set from 45% to 73%.

We revisit domain adaptation for parsers in the neural era. First, we show that recent advances in word representations greatly diminish the need for domain adaptation when the target domain is syntactically similar to the source domain. As evidence, we train a parser on the Wall Street Journal alone that achieves over 90% F1 on the Brown corpus. For more syntactically distant domains, we provide a simple way to adapt a parser using only dozens of partial annotations. For instance, we increase the percentage of error-free geometry-domain parses in a held-out set from 45% to 73% using approximately five dozen training examples. In the process, we demonstrate a new state-of-the-art single model result on the Wall Street Journal test set of 94.3%. This is an absolute increase of 1.7% over the previous state-of-the-art of 92.6%.
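The abstract does not spell out how a parse is judged "error-free" against a partial annotation. The Python sketch below shows one plausible reading, purely for illustration: it assumes a partial annotation lists some spans that must be constituents and some that must not be, and that a parse is error-free when it respects all of them. The names `is_error_free` and `error_free_rate`, the span encoding, and the toy data are hypothetical and are not taken from the paper or its code.

```python
from typing import Iterable, Set, Tuple

# A span is (start, end) over token offsets, end exclusive (assumed encoding).
Span = Tuple[int, int]
Example = Tuple[Set[Span], Set[Span], Set[Span]]  # (predicted, required, forbidden)


def is_error_free(predicted: Set[Span], required: Set[Span], forbidden: Set[Span]) -> bool:
    """True if the predicted constituents contain every required span
    and none of the forbidden spans (assumed definition of error-free)."""
    return required <= predicted and not (forbidden & predicted)


def error_free_rate(examples: Iterable[Example]) -> float:
    """Fraction of examples whose predicted parse respects its partial annotation."""
    examples = list(examples)
    if not examples:
        return 0.0
    ok = sum(is_error_free(p, r, f) for p, r, f in examples)
    return ok / len(examples)


# Toy held-out set (made-up spans, not data from the paper).
examples = [
    ({(0, 5), (0, 2), (2, 5)}, {(0, 2)}, {(1, 3)}),  # respects the annotation
    ({(0, 5), (1, 3)}, {(0, 2)}, set()),             # misses a required span
    ({(0, 4), (0, 3)}, set(), {(0, 3)}),             # contains a forbidden span
    ({(0, 3), (1, 3)}, {(1, 3)}, set()),             # respects the annotation
]
rate = error_free_rate(examples)
print(f"error-free parses: {rate:.0%}")
```

Under this reading, the reported 45% and 73% figures would correspond to `error_free_rate` computed over the held-out geometry sentences before and after adaptation.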
