CL · Feb 20, 2023

90% F1 Score in Relational Triple Extraction: Is it Real?

Pratik Saini, Samiran Pal, Tapas Nayak, Indrajit Bhattacharya

arXiv:2302.09887v2 · 0.93 citations · h-index: 23

Originality: Incremental advance

AI Analysis

This work addresses inflated benchmark results in relational triple extraction, a core step in knowledge base construction, showing that reported scores overstate performance in realistic settings that include sentences with no triples.

The paper tackles the problem of overestimated performance in relational triple extraction by evaluating state-of-the-art models under a more realistic setting that includes sentences with zero triples, revealing a significant decline of 6-15% in F1 scores across datasets. It proposes a two-step BERT-based approach that improves performance in this setting.

Extracting relational triples from text is a crucial task for constructing knowledge bases. Recent advancements in joint entity and relation extraction models have demonstrated remarkable F1 scores ($\ge 90\%$) in accurately extracting relational triples from free text. However, these models have been evaluated under restrictive experimental settings and unrealistic datasets. They overlook sentences with zero triples (zero-cardinality), thereby simplifying the task. In this paper, we present a benchmark study of state-of-the-art joint entity and relation extraction models under a more realistic setting. We include sentences that lack any triples in our experiments, providing a comprehensive evaluation. Our findings reveal a significant decline (approximately 10-15\% in one dataset and 6-14\% in another dataset) in the models' F1 scores within this realistic experimental setup. Furthermore, we propose a two-step modeling approach that utilizes a simple BERT-based classifier. This approach leads to overall performance improvement in these models within the realistic experimental setting.
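A toy illustration of the two points in the abstract may help: why adding zero-triple sentences lowers triple-level micro F1, and how a sentence-level gate placed before the extractor can recover part of the loss. This is a minimal sketch, not the authors' evaluation code. The sentence ids, triples, extractor outputs, and the `has_triples` stub are all hypothetical; in the paper the gate is a BERT-based classifier, for which a simple lookup stands in here.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def micro_f1(gold: List[Set[Triple]], pred: List[Set[Triple]]) -> float:
    """Triple-level micro F1 over aligned lists of per-sentence triple sets."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


def two_step(
    sentences: List[str],
    has_triples: Callable[[str], bool],
    extract: Callable[[str], Set[Triple]],
) -> List[Set[Triple]]:
    """Step 1: a binary gate decides whether a sentence holds any triple.
    Step 2: the extractor runs only on sentences the gate accepts."""
    return [extract(s) if has_triples(s) else set() for s in sentences]


# Hypothetical toy data: s1 and s2 contain one triple each, s3 and s4 contain none.
GOLD: Dict[str, Set[Triple]] = {
    "s1": {("A", "r", "B")},
    "s2": {("C", "r", "D")},
    "s3": set(),
    "s4": set(),
}

# Hypothetical extractor output: correct on s1 and s2, but it also emits
# a spurious triple for each zero-triple sentence.
EXTRACTED: Dict[str, Set[Triple]] = {
    "s1": {("A", "r", "B")},
    "s2": {("C", "r", "D")},
    "s3": {("E", "r", "F")},
    "s4": {("G", "r", "H")},
}


def extract(sentence: str) -> Set[Triple]:
    return EXTRACTED[sentence]


def has_triples(sentence: str) -> bool:
    # Stand-in for a trained classifier; it wrongly accepts s4.
    return sentence in {"s1", "s2", "s4"}


restricted = ["s1", "s2"]            # zero-triple sentences filtered out
realistic = ["s1", "s2", "s3", "s4"]  # zero-triple sentences included

f1_restricted = micro_f1([GOLD[s] for s in restricted],
                         [extract(s) for s in restricted])
f1_realistic = micro_f1([GOLD[s] for s in realistic],
                        [extract(s) for s in realistic])
f1_two_step = micro_f1([GOLD[s] for s in realistic],
                       two_step(realistic, has_triples, extract))

print(round(f1_restricted, 3), round(f1_realistic, 3), round(f1_two_step, 3))
```

In this toy case recall is unchanged across the three settings. The drop comes entirely from false positives on zero-triple sentences, which is why a gate that removes even some of them raises F1. A real gate can also reject sentences that do contain triples, which would cost recall; the stub above does not model that.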
