CL · Oct 18, 2024

Cross-Document Event-Keyed Summarization

William Walden, Pavlo Kuchmiichuk, Alexander Martin, Chihsheng Jin, Angela Cao, Claire Sun, Curisia Allen, Aaron Steven White

arXiv:2410.14795v21.91 citations · h-index: 5 · Has Code · Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025)

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of summarizing a single event as described across multiple sources, which is relevant to NLP researchers and practitioners. The contribution is incremental: it extends an existing task, event-keyed summarization, to a cross-document setting.

The paper tackles cross-document event-keyed summarization (CDEKS) by introducing SEAMUS, a high-quality dataset based on an expert reannotation of FAMUS. It presents baselines, including fine-tuned models and prompted LLMs, and positions SEAMUS as a valuable benchmark for this new task.

Event-keyed summarization (EKS) requires summarizing a specific event described in a document given the document text and an event representation extracted from it. In this work, we extend EKS to the cross-document setting (CDEKS), in which summaries must synthesize information from accounts of the same event as given by multiple sources. We introduce SEAMUS (Summaries of Events Across Multiple Sources), a high-quality dataset for CDEKS based on an expert reannotation of the FAMUS dataset for cross-document argument extraction. We present a suite of baselines on SEAMUS -- covering both smaller, fine-tuned models, as well as zero- and few-shot prompted LLMs -- along with detailed ablations and a human evaluation study, showing SEAMUS to be a valuable benchmark for this new task.
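The input described in the abstract (several accounts of one event, each with its own text and extracted event representation) can be pictured with a small sketch. Everything below is an illustrative assumption: the class names, field names, role labels, and prompt layout are hypothetical and are not the SEAMUS schema or the authors' baseline code, and the example texts are invented toy data.

```python
from dataclasses import dataclass, field


@dataclass
class EventAccount:
    """One source's account of the event (hypothetical structure)."""
    source_id: str
    text: str
    # Role name -> argument strings extracted from this document.
    arguments: dict = field(default_factory=dict)


@dataclass
class CDEKSInput:
    """Hypothetical cross-document input: one event, several accounts."""
    event_type: str
    accounts: list


def merge_arguments(example):
    """Union per-document arguments by role, keeping first-seen order."""
    merged = {}
    for account in example.accounts:
        for role, values in account.arguments.items():
            bucket = merged.setdefault(role, [])
            for value in values:
                if value not in bucket:
                    bucket.append(value)
    return merged


def build_prompt(example):
    """Assemble a zero-shot style prompt from the event and all accounts."""
    lines = [f"Event type: {example.event_type}", "Arguments:"]
    for role, values in merge_arguments(example).items():
        lines.append(f"- {role}: {'; '.join(values)}")
    for account in example.accounts:
        lines.append(f"[{account.source_id}] {account.text}")
    lines.append("Summarize this event using information from all sources.")
    return "\n".join(lines)


# Invented toy data, for illustration only.
example = CDEKSInput(
    event_type="Arrest",
    accounts=[
        EventAccount("report", "Police arrested a suspect on Monday.",
                     {"Suspect": ["a suspect"], "Time": ["Monday"]}),
        EventAccount("source", "The suspect was detained downtown by police.",
                     {"Suspect": ["a suspect"], "Place": ["downtown"]}),
    ],
)
print(build_prompt(example))
```

Merging arguments by role is one simple way to expose cross-document information to a summarizer: a role filled only in the second account (here `Place`) still reaches the prompt, while duplicate fillers are listed once.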
