MELG Oct 24, 2024

Evolving Voices Based on Temporal Poisson Factorisation

arXiv:2410.18486v2, 2 citations, h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses the need for flexible topic models to track changes in vocabulary and topic prevalence over time in political speech data, representing an incremental advancement in topic modeling for temporal text analysis.

The authors study how the vocabulary of political speech evolves over more than 30 years. They propose the temporal Poisson factorisation (TPF) model, which extends Poisson factorisation to sparse count data with time stamps, and they empirically compare different model specifications and variational inference methods on U.S. Senate speeches from 1981 to 2016.

The world is evolving and so is the vocabulary used to discuss topics in speech. Analysing political speech data from more than 30 years requires the use of flexible topic models to uncover the latent topics and their change in prevalence over time as well as the change in the vocabulary of the topics. We propose the temporal Poisson factorisation (TPF) model as an extension to the Poisson factorisation model to model sparse count data matrices obtained based on the bag-of-words assumption from text documents with time stamps. We discuss and empirically compare different model specifications for the time-varying latent variables consisting either of a flexible auto-regressive structure of order one or a random walk. Estimation is based on variational inference where we consider a combination of coordinate ascent updates with automatic differentiation using batching of documents. Suitable variational families are proposed to ease inference. We compare results obtained using independent univariate variational distributions for the time-varying latent variables to those obtained with a multivariate variant. We discuss in detail the results of the TPF model when analysing speeches from 18 sessions in the U.S. Senate (1981-2016).
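To make the two specifications for the time-varying latent variables concrete, here is a minimal simulation sketch in Python. It is not the authors' implementation: the abstract does not give the priors, the link function, or which latent variables vary over time, so the log-scale topic-term intensities, the Gamma document intensities, and all names (`latent_path`, `sample_poisson`, `simulate_counts`, `delta`, `sigma`) are illustrative assumptions. The only point it shows is that an auto-regressive structure of order one with coefficient `delta` reduces to a random walk when `delta = 1`, and that counts are drawn from a Poisson distribution whose rate depends on the document's time stamp.

```python
import math
import random


def latent_path(n_steps, delta, sigma, mean=0.0, rng=random):
    """Simulate one time-varying latent variable.

    AR(1) around `mean` with coefficient `delta`; delta = 1.0 gives a
    random walk. (Assumed parameterisation, not taken from the paper.)
    """
    path = [mean + rng.gauss(0.0, sigma)]
    for _ in range(n_steps - 1):
        path.append(mean + delta * (path[-1] - mean) + rng.gauss(0.0, sigma))
    return path


def sample_poisson(rate, rng=random):
    """Draw a Poisson count with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(doc_times, n_steps, n_topics, n_terms, delta, sigma, seed=0):
    """Simulate a document-term count matrix with time-dependent topics.

    doc_times[d] is the time index of document d (0 .. n_steps - 1).
    """
    rng = random.Random(seed)
    # log_beta[k][v][t]: log intensity of term v in topic k at time t (assumption).
    log_beta = [
        [latent_path(n_steps, delta, sigma, mean=-1.0, rng=rng) for _ in range(n_terms)]
        for _ in range(n_topics)
    ]
    counts = []
    for t in doc_times:
        # Document-topic intensities; the Gamma(0.3, 1.0) choice is hypothetical.
        theta = [rng.gammavariate(0.3, 1.0) for _ in range(n_topics)]
        row = []
        for v in range(n_terms):
            rate = sum(theta[k] * math.exp(log_beta[k][v][t]) for k in range(n_topics))
            row.append(sample_poisson(rate, rng))
        counts.append(row)
    return counts


# 18 time steps mirror the 18 Senate sessions; every other setting is arbitrary.
ar1_counts = simulate_counts([0, 0, 9, 17], 18, n_topics=3, n_terms=6, delta=0.5, sigma=0.3)
rw_counts = simulate_counts([0, 0, 9, 17], 18, n_topics=3, n_terms=6, delta=1.0, sigma=0.3)
```

With a small Gamma shape the simulated rows are mostly zeros, matching the sparse count matrices the abstract describes. The sketch covers only the generative side; the variational inference scheme (coordinate ascent combined with automatic differentiation over document batches) is not reproduced here.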
