Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection
This work addresses a methodological issue in coreference resolution evaluation, offering NLP researchers a scalable alternative to costly manual annotation of minimum spans.
The paper tackles the entanglement of coreference evaluation with mention boundary detection by proposing MINA, an algorithm for automatically extracting minimum spans from text. Experiments show that MINA's extracted spans are consistent with expert annotations, and that minimum-span evaluation is particularly important in cross-dataset settings, where domain shift makes detected mention boundaries noisier.
The common practice in coreference resolution is to identify and evaluate the maximum span of mentions. Using maximum spans entangles coreference evaluation with the challenges of mention boundary detection, such as prepositional phrase attachment. To address this problem, minimum spans have been manually annotated in smaller corpora. However, this additional annotation is costly, so the solution does not scale to large corpora. In this paper, we propose the MINA algorithm for automatically extracting minimum spans, so that all corpora can benefit from minimum span evaluation. We show that the minimum spans extracted by MINA are consistent with those manually annotated by experts. Our experiments show that using minimum spans is particularly important in cross-dataset coreference evaluation, in which detected mention boundaries are noisier due to domain shift. We will integrate MINA into https://github.com/ns-moosavi/coval for reporting standard coreference scores based on both maximum and automatically detected minimum spans.
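To make the difference between the two evaluation modes concrete, the following is a minimal sketch of mention matching under maximum and minimum spans. It does not reproduce MINA or the coval implementation: the `GoldMention` class, the function names, and the matching rule (a system mention matches if it stays inside the gold maximum span and covers every minimum-span token) are assumptions chosen for illustration, and the minimum span is supplied by hand rather than extracted automatically.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token offsets


@dataclass(frozen=True)
class GoldMention:
    """Hypothetical gold mention carrying both span types."""
    max_span: Span            # full annotated boundary
    min_span: FrozenSet[int]  # token indices of the minimum span (assumed given)


def matches_max(gold: GoldMention, system_span: Span) -> bool:
    """Maximum-span evaluation: boundaries must agree exactly."""
    return gold.max_span == system_span


def matches_min(gold: GoldMention, system_span: Span) -> bool:
    """Assumed minimum-span rule: the system mention may not extend beyond
    the gold maximum span and must contain every minimum-span token."""
    start, end = system_span
    gold_start, gold_end = gold.max_span
    within_max = gold_start <= start and end <= gold_end
    covers_min = all(start <= tok <= end for tok in gold.min_span)
    return within_max and covers_min


# Tokens: 0 "the", 1 "president", 2 "of", 3 "the", 4 "company"
gold = GoldMention(max_span=(0, 4), min_span=frozenset({1}))

# A system that attaches the prepositional phrase elsewhere predicts
# only "the president" (tokens 0..1).
system_span = (0, 1)

exact_match = matches_max(gold, system_span)
lenient_match = matches_min(gold, system_span)
```

In this toy case the shortened mention is rejected under exact maximum-span matching but accepted under the minimum-span rule, which is the kind of boundary error that the abstract argues should not be counted against coreference quality.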