CL Sep 8, 2019

Evaluating Topic Quality with Posterior Variability

Linzi Xing, Michael J. Paul, Giuseppe Carenini

arXiv:1909.03524v230.0996 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for reliable automatic evaluation of topic models. The advance is incremental: it builds on existing evaluation methods but introduces a new metric.

The paper tackles the problem of automatically evaluating topic quality in probabilistic topic models such as LDA. It derives a novel metric based on posterior variability, which achieves state-of-the-art correlations with human judgments of topic quality on three corpora.

Probabilistic topic models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters. We derive a novel measure of LDA topic quality using the variability of the posterior distributions. Compared to several existing baselines for automatic topic evaluation, the proposed metric achieves state-of-the-art correlations with human judgments of topic quality in experiments on three corpora. We additionally demonstrate that topic quality estimation can be further improved using a supervised estimator that combines multiple metrics.
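To make the idea of "posterior variability" concrete, the sketch below computes one possible per-topic variability statistic from a set of posterior samples. It is an illustration under stated assumptions, not the paper's exact formula: the abstract does not give the definition, so the choice of statistic (coefficient of variation of each topic's document weight across samples, averaged over documents), the function name `topic_variability`, and the toy `samples` data are all hypothetical.

```python
from statistics import mean, pstdev


def topic_variability(samples):
    """Return one variability score per topic.

    samples[s][d][k] is the weight of topic k in document d at posterior
    sample s (for example, successive Gibbs sampling iterations).
    Assumption: the score is the coefficient of variation (std / mean)
    across samples, averaged over documents. The paper may define it
    differently.
    """
    n_docs = len(samples[0])
    n_topics = len(samples[0][0])
    scores = []
    for k in range(n_topics):
        per_doc = []
        for d in range(n_docs):
            values = [sample[d][k] for sample in samples]
            mu = mean(values)
            # Guard against division by zero when a topic never appears in a document.
            per_doc.append(pstdev(values) / mu if mu > 0 else 0.0)
        scores.append(mean(per_doc))
    return scores


# Toy data (made up): 3 posterior samples, 2 documents, 3 topics.
# Topic 0 has the same weight in every sample; topics 1 and 2 fluctuate.
samples = [
    [[0.6, 0.3, 0.1], [0.5, 0.1, 0.4]],
    [[0.6, 0.1, 0.3], [0.5, 0.4, 0.1]],
    [[0.6, 0.2, 0.2], [0.5, 0.25, 0.25]],
]

for topic_id, score in enumerate(topic_variability(samples)):
    print(f"topic {topic_id}: variability = {score:.3f}")
```

In this toy input, topic 0 gets a score of zero because its weights never change across samples, while topics 1 and 2 get positive scores. How such a score maps to topic quality is the subject of the paper and is not assumed here.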
