CL, Nov 7, 2016

Presenting a New Dataset for the Timeline Generation Problem

arXiv:1611.02025v1, 19 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides a resource for researchers working on timeline generation from text, though its contribution is incremental: it focuses on dataset creation rather than algorithmic advancement.

The paper addresses the lack of a standard dataset and evaluation method for timeline generation. It presents a new dataset of 18,793 news articles covering 39 entities, each with a gold standard timeline, and validates the dataset by showing that top Google results outperform straw-man baselines.

The timeline generation task summarises an entity's biography by selecting stories representing key events from a large pool of relevant documents. This paper addresses the lack of a standard dataset and evaluative methodology for the problem. We present and make publicly available a new dataset of 18,793 news articles covering 39 entities. For each entity, we provide a gold standard timeline and a set of entity-related articles. We propose ROUGE as an evaluation metric and validate our dataset by showing that top Google results outperform straw-man baselines.
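To make the proposed evaluation concrete, here is a minimal sketch of a ROUGE-style score between a system timeline and a gold standard timeline. The abstract does not say which ROUGE variant or preprocessing the authors use, so this assumes ROUGE-1 (unigram overlap) with lowercased whitespace tokenisation. The function names `tokenize` and `rouge_1` are hypothetical and are not taken from the paper or from any ROUGE toolkit.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real ROUGE toolkits
    # typically apply more careful tokenisation and optional stemming.
    return text.lower().split()


def rouge_1(candidate, reference):
    """Unigram-overlap recall, precision and F1 between two texts.

    Illustrative only: the paper may use a different ROUGE variant.
    """
    cand_counts = Counter(tokenize(candidate))
    ref_counts = Counter(tokenize(reference))

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both texts (multiset intersection).
    overlap = sum((cand_counts & ref_counts).values())

    ref_total = sum(ref_counts.values())
    cand_total = sum(cand_counts.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    # Toy strings standing in for a system timeline and a gold timeline.
    system = "the senator won the election"
    gold = "the senator lost the election in november"
    print(rouge_1(system, gold))
```

In this setting, the candidate text would be the concatenated stories selected for an entity (for example, the top search results or a baseline's output), and the reference would be that entity's gold standard timeline.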

