ML | Oct 9, 2017

Conic Scan-and-Cover algorithms for nonparametric topic modeling

arXiv:1710.02952v1 | 16 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of an unknown number of topics in topic modeling. The advance is incremental because it builds on existing geometric analyses.

The authors tackle topic modeling when the number of topics is unknown, proposing new algorithms based on the geometry of the topic simplex. The algorithms achieve topic estimation accuracy comparable to that of a Gibbs sampler, which must be given the number of topics, and are among the fastest of several state-of-the-art parametric techniques.

We propose new algorithms for topic modeling when the number of topics is unknown. Our approach relies on an analysis of the concentration of mass and angular geometry of the topic simplex, a convex polytope constructed by taking the convex hull of vertices representing the latent topics. Our algorithms are shown in practice to have topic estimation accuracy comparable to that of a Gibbs sampler, which requires the number of topics to be given. Moreover, they are among the fastest of several state-of-the-art parametric techniques. Statistical consistency of our estimator is established under some conditions.
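To make the geometry in the abstract concrete, here is a minimal Python sketch. It is not the authors' algorithm: the function names (`simulate_documents`, `greedy_cone_cover`), the Dirichlet parameters, and the cosine threshold are all assumptions chosen for illustration. It simulates documents as convex combinations of topic vertices, so they lie inside the topic simplex, and then runs a simplified greedy loop that repeatedly picks the uncovered point farthest from the data mean and covers every point falling inside a cone around that direction. The number of cones found is only a rough stand-in for the number of topics and depends heavily on the threshold.

```python
import numpy as np


def simulate_documents(n_docs=500, n_topics=3, vocab_size=20, seed=0):
    """Draw topics and noiseless document means inside the topic simplex.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Each topic is a point on the vocabulary simplex (a vertex of the topic simplex).
    topics = rng.dirichlet(np.full(vocab_size, 0.1), size=n_topics)
    # Each document mixes the topics with weights that sum to one.
    weights = rng.dirichlet(np.full(n_topics, 0.5), size=n_docs)
    # Convex combinations of the vertices lie in their convex hull.
    doc_means = weights @ topics
    return topics, weights, doc_means


def greedy_cone_cover(points, cos_threshold=0.6):
    """Toy scan-and-cover loop (a simplification, not the published method).

    Scan: pick the uncovered point farthest from the data mean.
    Cover: mark every point whose direction from the mean has cosine
    at least `cos_threshold` with the picked direction.
    Returns the unit directions of the cones that were opened.
    """
    centered = points - points.mean(axis=0)
    norms = np.linalg.norm(centered, axis=1)
    uncovered = norms > 0  # points exactly at the mean have no direction
    directions = []
    while uncovered.any():
        idx = np.argmax(np.where(uncovered, norms, -np.inf))
        direction = centered[idx] / norms[idx]
        directions.append(direction)
        cosines = (centered @ direction) / np.maximum(norms, 1e-12)
        uncovered &= cosines < cos_threshold
    return np.array(directions)


if __name__ == "__main__":
    topics, weights, doc_means = simulate_documents()
    cones = greedy_cone_cover(doc_means)
    print("true number of topics:", topics.shape[0])
    print("number of cones opened:", cones.shape[0])
```

The loop always terminates because each picked point has cosine 1 with its own direction and is therefore covered in the same iteration. A wider cone (lower `cos_threshold`) opens fewer cones, and a narrower one opens more, which is why a principled choice of the angular parameter matters in any method of this kind.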

Code Implementations: 1 repo
