SI · IR · Jan 24, 2018

Understanding news story chains using information retrieval and network clustering techniques

arXiv:1801.07988v1 · 32 citations
Originality: Incremental advance
AI Analysis

This work addresses a gap in communication studies by enabling efficient analysis of news story chains, which matter for public recall and agenda-setting.

The authors tackle the problem of identifying connected news articles that form stories, presenting an automated method based on information retrieval and network clustering techniques. They apply it to a corpus of 61,864 articles and find that more than 50% of UK news production occurs within stories.

Content analysis of news stories (whether manual or automatic) is a cornerstone of the communication studies field. However, much research is conducted at the level of individual news articles, despite the fact that news events (especially significant ones) are frequently presented as "stories" by news outlets: chains of connected articles covering the same event from different angles. These stories are theoretically highly important in terms of increasing public recall of news items and enhancing the agenda-setting power of the press. Yet thus far, the field has lacked an efficient method for detecting groups of articles which form stories in a way that enables their analysis. In this work, we present a novel, automated method for identifying linked news stories from within a corpus of articles. This method makes use of techniques drawn from the field of information retrieval to identify textual closeness of pairs of articles, and then clustering techniques taken from the field of network analysis to group these articles into stories. We demonstrate the application of the method to a corpus of 61,864 articles, and show how it can efficiently identify valid story clusters within the corpus. We use the results to make observations about the prevalence and dynamics of stories within the UK news media, showing that more than 50% of news production takes place within stories.
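The two-stage idea described in the abstract (score textual closeness for pairs of articles, then cluster the resulting network into stories) can be illustrated with a minimal sketch. The abstract does not name the specific similarity measure or clustering algorithm, so everything here is an assumption chosen for brevity: TF-IDF cosine similarity stands in for the information retrieval step, connected components stand in for the network clustering step, and the function names, the 0.2 threshold, and the example headlines are all hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) with a smoothed IDF.

    Assumption: TF-IDF is a stand-in for the paper's retrieval step.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: c * math.log((1 + n) / (1 + df[t])) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def story_clusters(docs, threshold=0.2):
    """Link article pairs above the threshold, then return connected components.

    Assumption: connected components (via union-find) replace whatever
    network clustering algorithm the authors actually use.
    """
    vectors = tfidf_vectors(docs)
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(docs)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def share_in_stories(clusters):
    """Fraction of articles that belong to a cluster of two or more."""
    total = sum(len(c) for c in clusters)
    in_story = sum(len(c) for c in clusters if len(c) > 1)
    return in_story / total if total else 0.0


# Hypothetical toy corpus, not drawn from the paper's data.
example_docs = [
    "Storm floods coastal town, residents evacuated",
    "Coastal town residents return after storm floods",
    "Central bank raises interest rates again",
    "Interest rates rise as central bank acts on inflation",
    "Local team wins football cup final",
]
clusters = story_clusters(example_docs)
share = share_in_stories(clusters)
```

On this toy corpus the two storm headlines and the two interest-rate headlines each form a story, while the football headline stays on its own. A pairwise comparison like this scales quadratically with corpus size, so a corpus of tens of thousands of articles would likely need some restriction on which pairs are compared (for example, by publication date), which this sketch leaves out.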

