CL · Oct 25, 2022

CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization

Hossein Rajaby Faghihi, Bashar Alhafni, Ke Zhang, Shihao Ran, Joel Tetreault, Alejandro Jaimes

arXiv:2210.14190v124.2294 citations · h-index: 74 · Has Code

Originality: Synthesis-oriented

AI Analysis

This paper provides a benchmark that researchers and practitioners in emergency response can use to improve technical approaches for leveraging social media data during local crises. The contribution is incremental, as it builds on existing data collection methods.

The authors address the lack of datasets for benchmarking timeline extraction and summarization from social media during crises by creating CrisisLTLSum, the largest dataset of local crisis event timelines to date, with 1,000 timelines across four domains. Their initial experiments show a significant gap between strong baselines and human performance.

Social media has increasingly played a key role in emergency response: first responders can use public posts to better react to ongoing crisis events and deploy the necessary resources where they are most needed. Timeline extraction and abstractive summarization are critical technical tasks for leveraging large numbers of social media posts about events. Unfortunately, there are few datasets for benchmarking technical approaches for those tasks. This paper presents CrisisLTLSum, the largest dataset of local crisis event timelines available to date. CrisisLTLSum contains 1,000 crisis event timelines across four domains: wildfires, local fires, traffic, and storms. We built CrisisLTLSum using a semi-automated cluster-then-refine approach to collect data from the public Twitter stream. Our initial experiments indicate a significant gap between the performance of strong baselines and human performance on both tasks. Our dataset, code, and models are publicly available.
