CL · Oct 31, 2018

Picking Apart Story Salads

arXiv:1810.13391v1 · 1090 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for automated narrative reconstruction in crisis scenarios. Its contribution is incremental: it offers a new dataset and task formulation rather than a breakthrough method.

The paper tackles the problem of automatically assembling coherent narratives from messy, distributed information during natural disasters and conflicts. It introduces Story Salads, a dataset of document mixtures generated from Wikipedia to pose challenging inference problems. It shows that simple bag-of-words similarity clustering falls short on the resulting task, and that effective grouping requires global context and coherence.

During natural disasters and conflicts, information about what happened is often confusing, messy, and distributed across many sources. We would like to be able to automatically identify relevant information and assemble it into coherent narratives of what happened. To make this task accessible to neural models, we introduce Story Salads, mixtures of multiple documents that can be generated at scale. By exploiting the Wikipedia hierarchy, we can generate salads that exhibit challenging inference problems. Story salads give rise to a novel, challenging clustering task, where the objective is to group sentences from the same narratives. We demonstrate that simple bag-of-words similarity clustering falls short on this task and that it is necessary to take into account global context and coherence.
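To make the task concrete, here is a minimal sketch of a two-document salad and a bag-of-words baseline. It is illustrative only and not the paper's code: the function names (`make_salad`, `cluster_two`, `accuracy`), the whitespace tokenizer, and the choice of average-linkage agglomerative clustering over cosine similarity are all assumptions made for this example. The paper builds its salads from Wikipedia, while this sketch simply shuffles two lists of sentences.

```python
import math
import random
from collections import Counter


def make_salad(doc_a, doc_b, seed=0):
    """Mix the sentences of two documents; keep source labels for evaluation."""
    labeled = [(s, 0) for s in doc_a] + [(s, 1) for s in doc_b]
    random.Random(seed).shuffle(labeled)
    sentences = [s for s, _ in labeled]
    labels = [label for _, label in labeled]
    return sentences, labels


def bow(sentence):
    """Bag-of-words vector using a naive lowercase whitespace tokenizer."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_two(sentences):
    """Average-linkage agglomerative clustering, merged down to two clusters.

    Assumes at least two sentences. Returns a 0/1 assignment per sentence.
    """
    vecs = [bow(s) for s in sentences]
    clusters = [[i] for i in range(len(sentences))]
    while len(clusters) > 2:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(cosine(vecs[i], vecs[j])
                            for i in clusters[a] for j in clusters[b])
                sim = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    assignment = [0] * len(sentences)
    for i in clusters[1]:
        assignment[i] = 1
    return assignment


def accuracy(pred, gold):
    """Clustering accuracy for two clusters, invariant to label swapping."""
    agree = sum(p == g for p, g in zip(pred, gold))
    return max(agree, len(gold) - agree) / len(gold)


if __name__ == "__main__":
    # Toy documents invented for this example.
    flood = ["The river flooded the town", "Rescue boats reached the flooded streets"]
    quake = ["The earthquake damaged the bridge", "Aftershocks shook the damaged city"]
    sentences, gold = make_salad(flood, quake)
    print(accuracy(cluster_two(sentences), gold))
```

A baseline like this scores each sentence pair on surface word overlap alone. When the two source documents share a topic and vocabulary, that overlap is similar within and across narratives, which is consistent with the abstract's point that global context and coherence are needed.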
