LG · ML · Jan 6, 2016

Streaming Gibbs Sampling for LDA Model

arXiv:1601.01142v1 · 16 citations
Originality: Incremental advance
AI Analysis

The method provides a scalable online Monte Carlo solution for topic modeling, addressing a bottleneck in streaming-data applications.

The paper tackles online learning for LDA models by developing a streaming Gibbs sampling method that achieves perplexity similar to batch collapsed Gibbs sampling and better than streaming variational Bayes.

Streaming variational Bayes (SVB) is successful in learning LDA models in an online manner. However, previous attempts to develop online Monte Carlo methods for LDA have had little success, often yielding much worse perplexity than their batch counterparts. We present a streaming Gibbs sampling (SGS) method, an online extension of collapsed Gibbs sampling (CGS). Our empirical study shows that SGS can reach perplexity similar to that of CGS and much better than that of SVB. Our distributed version of SGS, DSGS, is much more scalable than SVB, mainly because the communication complexity of its updates is small.
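
To make the idea of an "online extension" of CGS concrete, here is a minimal sketch in plain Python. It rests on an assumption the abstract does not spell out: each minibatch is sampled with ordinary collapsed Gibbs updates while the topic-word counts from earlier minibatches stay fixed and act as a prior, and the new counts are folded in afterwards. The function name `streaming_gibbs_lda`, the hyperparameters `alpha` and `beta`, and the `sweeps` parameter are hypothetical names chosen for illustration. The sketch does not cover the distributed variant (DSGS) or any other detail of the paper.

```python
import random


def streaming_gibbs_lda(minibatches, num_topics, vocab_size,
                        alpha=0.1, beta=0.01, sweeps=20, seed=0):
    """Illustrative sketch only (assumed update scheme, see lead-in).

    minibatches: iterable of minibatches; each minibatch is a list of
    documents; each document is a list of integer word ids.
    Returns the accumulated topic-word count matrix.
    """
    rng = random.Random(seed)
    # Counts accumulated from all previously processed minibatches.
    prior_tw = [[0] * vocab_size for _ in range(num_topics)]
    prior_t = [0] * num_topics

    for docs in minibatches:
        # Counts contributed by the current minibatch only.
        tw = [[0] * vocab_size for _ in range(num_topics)]
        t_tot = [0] * num_topics
        dt = [[0] * num_topics for _ in docs]

        def bump(d, k, w, delta):
            tw[k][w] += delta
            t_tot[k] += delta
            dt[d][k] += delta

        # Random initial topic assignment for every token.
        z = []
        for d, doc in enumerate(docs):
            z.append([rng.randrange(num_topics) for _ in doc])
            for k, w in zip(z[d], doc):
                bump(d, k, w, +1)

        # Collapsed Gibbs sweeps over the current minibatch; earlier
        # minibatches enter only through the fixed prior counts.
        for _ in range(sweeps):
            for d, doc in enumerate(docs):
                for i, w in enumerate(doc):
                    bump(d, z[d][i], w, -1)  # remove the token
                    weights = [
                        (dt[d][j] + alpha)
                        * (prior_tw[j][w] + tw[j][w] + beta)
                        / (prior_t[j] + t_tot[j] + vocab_size * beta)
                        for j in range(num_topics)
                    ]
                    k = rng.choices(range(num_topics), weights=weights)[0]
                    z[d][i] = k
                    bump(d, k, w, +1)  # re-add under the new topic

        # Fold this minibatch's counts into the running prior.
        for j in range(num_topics):
            prior_t[j] += t_tot[j]
            for w in range(vocab_size):
                prior_tw[j][w] += tw[j][w]

    return prior_tw
```

In this sketch the only state carried between minibatches is the topic-word count matrix, so each document is visited once and then discarded.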

