CL | Oct 13, 2022

An Empirical Study on Finding Spans

Microsoft
arXiv:2210.06824v2293 citations | h-index: 60
Originality: Synthesis-oriented
AI Analysis

This paper provides practical guidance for designing span-finding components in information extraction systems, though its contribution is incremental because it synthesizes existing methods.

The paper empirically studies span-finding methods for information extraction, finding that no single approach works best across tasks, with tagging yielding higher precision while enumeration and boundary prediction offer higher recall.

We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks. We focus on approaches that can be employed in training end-to-end information extraction systems, and find there is no definitive solution without considering task properties, and provide our observations to help with future design choices: 1) a tagging approach often yields higher precision while span enumeration and boundary prediction provide higher recall; 2) span type information can benefit a boundary prediction approach; 3) additional contextualization does not help span finding in most cases.
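
To make the three families named in the abstract concrete, here is a minimal sketch of how each one turns model output into candidate spans. It is an illustration only, not the paper's implementation: the function names, the plain B/I/O tag set, the inclusive (start, end) index convention, and the `max_width` limit are all assumptions chosen for this example, and the scoring models that would produce the tags or boundary positions are left out.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices (assumed convention)


def spans_from_bio(tags: List[str]) -> List[Span]:
    """Tagging: decode non-overlapping spans from a B/I/O tag sequence."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag in ("B", "O"):
            # A B or O tag closes any span that is currently open.
            if start is not None:
                spans.append((start, i - 1))
            start = i if tag == "B" else None
        elif tag == "I" and start is None:
            # Tolerate an I tag that has no preceding B by opening a span here.
            start = i
    if start is not None:
        spans.append((start, len(tags) - 1))
    return spans


def enumerate_spans(n_tokens: int, max_width: int) -> List[Span]:
    """Span enumeration: list every span up to max_width tokens wide.

    A downstream classifier (not shown) would keep or reject each candidate.
    """
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i, min(i + max_width, n_tokens))
    ]


def spans_from_boundaries(
    starts: List[int], ends: List[int], max_width: int
) -> List[Span]:
    """Boundary prediction: pair predicted start and end positions."""
    return [(s, e) for s in starts for e in ends if s <= e < s + max_width]


if __name__ == "__main__":
    print(spans_from_bio(["O", "B", "I", "O", "B"]))
    print(enumerate_spans(3, max_width=2))
    print(spans_from_boundaries([1, 4], [2, 4], max_width=3))
```

In this sketch, tagging commits to a single non-overlapping segmentation, whereas enumeration and boundary pairing propose many, possibly overlapping, candidates. That difference is one intuitive way to read the precision and recall contrast reported in the abstract, though the paper's own analysis should be consulted for the actual explanation.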
