ML · Aug 22, 2012

A non-parametric mixture model for topic modeling over time

arXiv:1208.4411v1 · 66 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for more flexible and computationally efficient time-varying topic models for researchers analyzing long-term textual data, though it is incremental relative to prior work.

The paper tackles the problem of modeling topic popularity changes over time in corpora by proposing npTOT, a non-parametric model that allows an unbounded number of topics and flexible temporal variations, and demonstrates its effectiveness through comparisons on synthetic and real datasets.

A single, stationary topic model such as latent Dirichlet allocation is inappropriate for modeling corpora that span long time periods, as the popularity of topics is likely to change over time. A number of models that incorporate time have been proposed, but in general they either exhibit limited forms of temporal variation or require computationally expensive inference methods. In this paper we propose non-parametric Topics over Time (npTOT), a model for time-varying topics that allows an unbounded number of topics and a flexible distribution over the temporal variations in those topics' popularity. We develop a collapsed Gibbs sampler for the proposed model and compare against existing models on synthetic and real document sets.
