Yi-Fan Lu

CL
h-index26
3papers
5citations
Novelty53%
AI Score32

3 Papers

CLMar 5, 2025
SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection

Yi-Fan Lu, Xian-Ling Mao, Tian Lan et al.

Automatic evaluation for Open Domain Event Detection (ODED) is a highly challenging task, because ODED is characterized by a vast diversity of un-constrained output labels from various domains. Nearly all existing evaluation methods for ODED usually first construct evaluation benchmarks with limited labels and domain coverage, and then evaluate ODED methods using metrics based on token-level label matching rules. However, this kind of evaluation framework faces two issues: (1) The limited evaluation benchmarks lack representatives of the real world, making it difficult to accurately reflect the performance of various ODED methods in real-world scenarios; (2) Evaluation metrics based on token-level matching rules fail to capture semantic similarity between predictions and golden labels. To address these two problems above, we propose a scalable and reliable Semantic-level Evaluation framework for Open domain Event detection (SEOE) by constructing a more representative evaluation benchmark and introducing a semantic evaluation metric. Specifically, our proposed framework first constructs a scalable evaluation benchmark that currently includes 564 event types covering 7 major domains, with a cost-effective supplementary annotation strategy to ensure the benchmark's representativeness. The strategy also allows for the supplement of new event types and domains in the future. Then, the proposed SEOE leverages large language models (LLMs) as automatic evaluation agents to compute a semantic F1-score, incorporating fine-grained definitions of semantically similar labels to enhance the reliability of the evaluation. Extensive experiments validate the representatives of the benchmark and the reliability of the semantic evaluation metric. Existing ODED methods are thoroughly evaluated, and the error patterns of predictions are analyzed, revealing several insightful findings.

CLOct 12, 2024
Beyond Exact Match: Semantically Reassessing Event Extraction by Large Language Models

Yi-Fan Lu, Xian-Ling Mao, Tian Lan et al.

Event extraction has gained extensive research attention due to its broad range of applications. However, the current mainstream evaluation method for event extraction relies on token-level exact match, which misjudges numerous semantic-level correct cases. This reliance leads to a significant discrepancy between the evaluated performance of models under exact match criteria and their real performance. To address this problem, we propose a reliable and semantic evaluation framework for event extraction, named RAEE, which accurately assesses extraction results at semantic-level instead of token-level. Specifically, RAEE leverages large language models (LLMs) as evaluation agents, incorporating an adaptive mechanism to achieve adaptive evaluations for precision and recall of triggers and arguments. Extensive experiments demonstrate that: (1) RAEE achieves a very strong correlation with human judgments; (2) after reassessing 14 models, including advanced LLMs, on 10 datasets, there is a significant performance gap between exact match and RAEE. The exact match evaluation significantly underestimates the performance of existing event extraction models, and in particular underestimates the capabilities of LLMs; (3) fine-grained analysis under RAEE evaluation reveals insightful phenomena worth further exploration. The evaluation toolkit of our proposed RAEE is publicly released.

CLJun 20, 2024
EXCEEDS: Extracting Complex Events as Connecting the Dots to Graphs in Scientific Domain

Yi-Fan Lu, Xian-Ling Mao, Bo Wang et al.

It is crucial to utilize events to understand a specific domain. There are lots of research on event extraction in many domains such as news, finance and biology domain. However, scientific domain still lacks event extraction research, including comprehensive datasets and corresponding methods. Compared to other domains, scientific domain presents two characteristics: denser nuggets and more complex events. To solve the above problem, considering these two characteristics, we first construct SciEvents, a large-scale multi-event document-level dataset with a schema tailored for scientific domain. It has 2,508 documents and 24,381 events under refined annotation and quality control. Then, we propose EXCEEDS, a novel end-to-end scientific event extraction framework by storing dense nuggets in a grid matrix and simplifying complex event extraction into a dot construction and connection task. Experimental results demonstrate state-of-the-art performances of EXCEEDS on SciEvents. Additionally, we release SciEvents and EXCEEDS on GitHub.