Mention Annotations Alone Enable Efficient Domain Adaptation for Coreference Resolution
This work addresses domain adaptation for coreference resolution, offering NLP practitioners an incremental gain in annotation efficiency.
The paper tackles the high annotation cost of adapting coreference resolution models to new domains by proposing a method that requires annotating only mentions, which is nearly twice as fast as annotating full coreference chains. The method yields a 7-14% improvement in average F1 without additional annotator time.
Although recent neural models for coreference resolution have led to substantial improvements on benchmark datasets, transferring these models to new target domains that contain out-of-vocabulary spans and require different annotation schemes remains challenging. Typical approaches involve continued training on annotated target-domain data, but obtaining annotations is costly and time-consuming. We show that annotating mentions alone is nearly twice as fast as annotating full coreference chains. Accordingly, we propose a method for efficiently adapting coreference models, which includes a high-precision mention detection objective and requires annotating only mentions in the target domain. Extensive evaluation across three English coreference datasets, CoNLL-2012 (news/conversation), i2b2/VA (medical notes), and previously unstudied child welfare notes, reveals that our approach facilitates annotation-efficient transfer and results in a 7-14% improvement in average F1 without increasing annotator time.
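The abstract does not specify the form of the high-precision mention detection objective. As a rough illustration only, one common way to bias a span classifier toward precision is a class-weighted binary cross-entropy that penalizes scores on non-mention spans more heavily. The sketch below assumes this form; the function name `weighted_mention_loss` and the parameter `fp_weight` are hypothetical and are not taken from the paper.

```python
import math


def _softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_mention_loss(scores, labels, fp_weight=2.0):
    """Class-weighted binary cross-entropy over candidate spans.

    Illustrative assumption, not the paper's stated objective.

    scores:    raw (pre-sigmoid) mention scores, one per candidate span.
    labels:    1 if the span is an annotated mention, otherwise 0.
    fp_weight: hypothetical knob; values above 1 make high scores on
               non-mention spans more costly, which favors precision
               at some cost to recall.
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal in length")

    total = 0.0
    for score, label in zip(scores, labels):
        if label == 1:
            # -log(sigmoid(score)) for annotated mentions.
            total += _softplus(-score)
        else:
            # -log(1 - sigmoid(score)), scaled up for non-mentions.
            total += fp_weight * _softplus(score)
    return total / len(scores)


if __name__ == "__main__":
    # Two candidate spans: one annotated mention, one non-mention.
    print(weighted_mention_loss([2.0, 2.0], [1, 0], fp_weight=1.0))
    print(weighted_mention_loss([2.0, 2.0], [1, 0], fp_weight=3.0))
```

With `fp_weight` set to 1 this reduces to ordinary binary cross-entropy. Because only mention-level labels enter the loss, a term of this kind can be trained from mention annotations alone, which is the setting the abstract describes.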