CL Oct 17, 2022

CrossRE: A Cross-Domain Dataset for Relation Extraction

arXiv:2210.09345v1, 305 citations, h-index: 46
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of evaluating relation extraction systems in realistic, cross-domain scenarios. Its contribution is incremental, since it primarily provides a new dataset.

The authors tackle the lack of out-of-distribution evaluation in relation extraction by introducing CrossRE, a cross-domain benchmark with six text domains and multi-label annotations. Their empirical evaluation with a state-of-the-art model shows that cross-domain RE is difficult.

Relation Extraction (RE) has attracted increasing attention, but current RE evaluation is limited to in-domain evaluation setups. Little is known about how well an RE system fares in challenging but realistic out-of-distribution evaluation setups. To address this gap, we propose CrossRE, a new, freely available cross-domain benchmark for RE, which comprises six distinct text domains and includes multi-label annotations. An additional innovation is that we release metadata collected during annotation, including explanations and flags for difficult instances. We provide an empirical evaluation with a state-of-the-art model for relation classification. As the metadata enables us to shed new light on the state-of-the-art model, we provide a comprehensive analysis of the impact of difficult cases and find correlations between model and human annotations. Overall, our empirical investigation highlights the difficulty of cross-domain RE. We release our dataset to spur more research in this direction.
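To make the cross-domain setup concrete, here is a minimal sketch of an evaluation grid: fit on one domain, score on every domain, and compare the in-domain and out-of-domain cells. Everything in it is an illustrative assumption rather than the authors' pipeline. The domain names (`domain_a`, `domain_b`), the relation labels, the toy instances, and the majority-label "model" are all made up, and for brevity the in-domain cells reuse the training instances instead of a held-out split. Micro-averaged F1 over label sets is used here as one common way to score multi-label predictions.

```python
from collections import Counter
from itertools import product

# Toy, made-up gold label sets per domain (one set per instance).
# Domain names and relation labels are placeholders, not CrossRE's.
DATA = {
    "domain_a": [{"rel_x"}, {"rel_x"}, {"rel_y"}, {"rel_x", "rel_y"}],
    "domain_b": [{"rel_y"}, {"rel_y"}, {"rel_z"}, {"rel_x"}],
}


def train_majority(instances):
    """Stand-in 'model': return the most frequent label in the training data."""
    counts = Counter(label for labels in instances for label in labels)
    top_label, _ = counts.most_common(1)[0]
    return top_label


def micro_f1(gold_sets, pred_sets):
    """Micro-averaged F1 over multi-label predictions."""
    true_pos = sum(len(g & p) for g, p in zip(gold_sets, pred_sets))
    n_pred = sum(len(p) for p in pred_sets)
    n_gold = sum(len(g) for g in gold_sets)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


def cross_domain_matrix(data):
    """Score every (train domain, test domain) pair."""
    scores = {}
    for source, target in product(data, repeat=2):
        label = train_majority(data[source])
        preds = [{label} for _ in data[target]]
        scores[(source, target)] = micro_f1(data[target], preds)
    return scores


if __name__ == "__main__":
    for (source, target), score in cross_domain_matrix(DATA).items():
        print(f"train={source} test={target} micro-F1={score:.3f}")
```

On this toy data the off-diagonal cells score lower than the diagonal ones, which is the kind of gap a cross-domain benchmark is designed to expose. The numbers themselves carry no meaning beyond the illustration.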

Code Implementations: 1 repo
