Cross-Document Event-Keyed Summarization
This work addresses the summarization of a single event as described by multiple sources, a problem relevant to NLP researchers and practitioners. The contribution is incremental in that it extends an existing task, event-keyed summarization, to the cross-document setting.
The paper tackles cross-document event-keyed summarization (CDEKS) by introducing SEAMUS, a high-quality dataset based on an expert reannotation of FAMUS. It presents baselines, including fine-tuned models and prompted LLMs, and argues that SEAMUS is a valuable benchmark for this new task.
Event-keyed summarization (EKS) requires summarizing a specific event described in a document given the document text and an event representation extracted from it. In this work, we extend EKS to the cross-document setting (CDEKS), in which summaries must synthesize information from accounts of the same event as given by multiple sources. We introduce SEAMUS (Summaries of Events Across Multiple Sources), a high-quality dataset for CDEKS based on an expert reannotation of the FAMUS dataset for cross-document argument extraction. We present a suite of baselines on SEAMUS -- covering both smaller, fine-tuned models and zero- and few-shot prompted LLMs -- along with detailed ablations and a human evaluation study, showing SEAMUS to be a valuable benchmark for this new task.
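To make the task input concrete, the following is a minimal sketch of how a CDEKS instance might be represented and linearized for a summarizer. It is not the SEAMUS format or the authors' pipeline: the class names (EventRecord, Account), the role labels, the example texts, and the "[Account i]" linearization are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EventRecord:
    """Hypothetical event representation: an event type plus role -> fillers."""
    event_type: str
    arguments: dict[str, list[str]] = field(default_factory=dict)


@dataclass
class Account:
    """One source's account of the event: its text and the event extracted from it."""
    text: str
    event: EventRecord


def merge_arguments(accounts):
    """Union role fillers across accounts, keeping first-seen order
    and dropping exact string duplicates (an assumed, naive merge)."""
    merged = {}
    for account in accounts:
        for role, fillers in account.event.arguments.items():
            bucket = merged.setdefault(role, [])
            for filler in fillers:
                if filler not in bucket:
                    bucket.append(filler)
    return merged


def build_cdeks_input(accounts):
    """Linearize every account (event record, then text) into one input string."""
    parts = []
    for i, account in enumerate(accounts, start=1):
        roles = "; ".join(
            f"{role}: {', '.join(fillers)}"
            for role, fillers in account.event.arguments.items()
        )
        header = f"[Account {i}] event={account.event.event_type} | {roles}"
        parts.append(f"{header}\n{account.text}")
    return "\n\n".join(parts)


# Invented example: two accounts of the same event with partly different details.
first = Account(
    text="A fire broke out downtown on Monday.",
    event=EventRecord("Fire", {"Place": ["downtown"], "Time": ["Monday"]}),
)
second = Account(
    text="Officials said the blaze injured two people downtown.",
    event=EventRecord("Fire", {"Place": ["downtown"], "Victim": ["two people"]}),
)

merged_roles = merge_arguments([first, second])
model_input = build_cdeks_input([first, second])
```

In the single-document EKS setting the list would hold one account; the cross-document setting differs in that a good summary has to cover details such as the time and the victims that appear in only one of the accounts.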