CL · Jun 4, 2018

History Playground: A Tool for Discovering Temporal Trends in Massive Textual Corpora

arXiv:1806.01185v1
13 citations
Originality: Synthesis-oriented
AI Analysis

This tool addresses the need for interactive, data-driven analysis in the humanities and social sciences, using mass-digitized text to uncover patterns that span centuries. Its contribution is incremental, since it builds on existing data-driven approaches.

The authors tackle the problem of discovering macroscopic temporal trends in massive textual corpora with History Playground, an interactive web-based tool. It uses scalable algorithms to extract trends and then exposes them for real-time exploration, with features that include standardization, regression, and change-point detection on the relative frequencies of ngrams.

Recent studies have shown that macroscopic patterns of continuity and change over the course of centuries can be detected through the analysis of time series extracted from massive textual corpora. Similar data-driven approaches have already revolutionised the natural sciences, and are widely believed to hold similar potential for the humanities and social sciences, driven by the mass-digitisation projects that are currently under way, and coupled with the ever-increasing number of documents which are "born digital". As such, new interactive tools are required to discover and extract macroscopic patterns from these vast quantities of textual data. Here we present History Playground, an interactive web-based tool for discovering trends in massive textual corpora. The tool makes use of scalable algorithms to first extract trends from textual corpora, before making them available for real-time search and discovery, presenting users with an interface to explore the data. Included in the tool are algorithms for standardization, regression, change-point detection in the relative frequencies of ngrams, multi-term indices and comparison of trends across different corpora.
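The operations named in the abstract (relative frequencies of ngrams, standardization, regression, change-point detection) can be illustrated with a minimal sketch. This is not the tool's implementation: the function names, the z-score form of standardization, the ordinary least-squares trend, the single least-squares mean-shift change point, and the toy counts are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def relative_frequency(counts, totals):
    """Per-year relative frequency: n-gram count divided by total n-grams that year."""
    return {year: counts.get(year, 0) / totals[year] for year in sorted(totals)}

def standardize(series):
    """Z-score a series (assumed form of standardization): (value - mean) / std dev."""
    values = list(series.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {year: 0.0 for year in series}
    return {year: (v - mu) / sigma for year, v in series.items()}

def linear_trend(series):
    """Ordinary least-squares slope and intercept of value against year."""
    xs, ys = list(series.keys()), list(series.values())
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar

def single_change_point(series):
    """First year of the second segment in the best two-segment constant-mean fit."""
    years, ys = list(series.keys()), list(series.values())

    def sse(segment):
        m = mean(segment)
        return sum((v - m) ** 2 for v in segment)

    best_k = min(range(1, len(ys)), key=lambda k: sse(ys[:k]) + sse(ys[k:]))
    return years[best_k]

# Made-up yearly counts for one hypothetical n-gram, and made-up corpus totals.
counts = {1900: 1, 1901: 1, 1902: 1, 1903: 5, 1904: 5, 1905: 5}
totals = {year: 100 for year in range(1900, 1906)}

freq = relative_frequency(counts, totals)
print(standardize(freq))
print(linear_trend(freq))
print(single_change_point(freq))
```

Standardizing before comparison puts ngrams with very different base frequencies on a common scale, which is one plausible reason a tool that compares trends across terms and corpora would offer it.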

