ML · Oct 27, 2016

Geometric Dirichlet Means algorithm for topic inference

arXiv:1610.09034v1 · 19 citations
Originality: Incremental advance
AI Analysis

This paper addresses computational bottlenecks in topic modeling that affect both researchers and practitioners, though it appears incremental because it builds on the existing LDA framework.

The authors tackle topic learning and inference by proposing a geometric algorithm based on the convex geometry of topics in LDA models. The method achieves accuracy comparable to Gibbs sampling while avoiding the computational inefficiencies of Gibbs sampling and variational inference.

We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end we study the optimization of a geometric loss function, which is a surrogate for the LDA likelihood. Our method involves a fast optimization-based weighted clustering procedure augmented with geometric corrections, which overcomes the computational and statistical inefficiencies encountered by other techniques based on Gibbs sampling and variational inference, while achieving accuracy comparable to that of a Gibbs sampler. The topic estimates produced by our method are shown to be statistically consistent under some conditions. The algorithm is evaluated with extensive experiments on simulated and real data.
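
The abstract describes the method only as weighted clustering augmented with geometric corrections. The sketch below is one illustrative reading of that description, not the authors' implementation. It assumes that documents are represented by normalized word frequencies, that clustering weights are document lengths, and that the "correction" pushes each cluster centre away from the global mean by a fixed factor before mapping it back onto the probability simplex. The names `weighted_kmeans`, `geometric_topic_sketch`, and the `extension` parameter are hypothetical.

```python
import numpy as np

def weighted_kmeans(X, weights, k, n_iter=50, seed=0):
    """Plain Lloyd iterations where each point carries a weight."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centre.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            mask = labels == j
            if mask.any():  # keep the old centre if a cluster empties
                centers[j] = np.average(X[mask], axis=0, weights=weights[mask])
    return centers, labels

def geometric_topic_sketch(counts, k, extension=1.5, seed=0):
    """Assumed pipeline: weighted clustering, then an outward correction."""
    counts = np.asarray(counts, dtype=float)
    doc_lengths = counts.sum(axis=1)          # assumed clustering weights
    freqs = counts / doc_lengths[:, None]     # documents as points on the simplex
    centers, _ = weighted_kmeans(freqs, doc_lengths, k, seed=seed)
    global_mean = np.average(freqs, axis=0, weights=doc_lengths)
    # Assumed correction: extend each centre along the ray from the global mean.
    topics = global_mean + extension * (centers - global_mean)
    # Map back to valid word distributions (non-negative, rows sum to one).
    topics = np.clip(topics, 0.0, None)
    topics /= topics.sum(axis=1, keepdims=True)
    return topics
```

Calling `geometric_topic_sketch(counts, k=3)` on a document-by-word count matrix returns a 3-by-vocabulary array whose rows are word distributions. The fixed `extension` factor is a placeholder; how the paper actually chooses its geometric corrections is not stated in the abstract.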

