CL, AI | May 16, 2023

About Evaluation of F1 Score for RECENT Relation Extraction System

arXiv:2305.09410v1 | 0.52 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the accuracy and reliability of evaluation metrics for relation extraction systems, but its contribution is incremental: it focuses on correcting errors in one specific system's reported results.

The authors re-evaluated the F1 score of the RECENT relation extraction system, which was originally reported to reach a state-of-the-art result of 75.2 on the TACRED dataset. After error correction and re-evaluation, the result drops to 65.16.

This document discusses the F1 score evaluation used in the article 'Relation Classification with Entity Type Restriction' by Shengfei Lyu and Huanhuan Chen, published in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. The authors of that article created a system named RECENT and claimed that it achieved a then state-of-the-art result of 75.2 (previous 74.8) on the TACRED dataset. After correcting errors and re-evaluating, the final result is 65.16.
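
For context, F1 on TACRED is conventionally reported as micro-averaged F1 over the positive relation labels, with the `no_relation` class excluded from the counts. The sketch below illustrates that convention only. It is not the RECENT evaluation code, nor the corrected evaluation discussed here; the function name `micro_f1` and the toy label lists are assumptions made for illustration.

```python
from typing import Sequence


def micro_f1(gold: Sequence[str], pred: Sequence[str],
             negative_label: str = "no_relation") -> float:
    """Micro-averaged F1 over positive labels (hypothetical helper).

    Predictions and gold labels equal to `negative_label` are not counted
    as predicted or relevant relations, respectively.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # A prediction is correct only if it is a positive label matching gold.
    correct = sum(1 for g, p in zip(gold, pred)
                  if p != negative_label and p == g)
    # Number of positive relations the system predicted.
    predicted = sum(1 for p in pred if p != negative_label)
    # Number of positive relations in the gold annotations.
    relevant = sum(1 for g in gold if g != negative_label)

    precision = correct / predicted if predicted else 0.0
    recall = correct / relevant if relevant else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up data, not taken from TACRED or from the paper).
gold = ["per:title", "no_relation", "org:founded_by", "per:title"]
pred = ["per:title", "per:title", "no_relation", "org:founded_by"]
print(f"micro-F1 = {micro_f1(gold, pred):.4f}")
```

In the toy example, one of three predicted relations is correct and one of three gold relations is recovered, so precision, recall, and F1 are all 1/3.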
