ML | LG | Jul 25, 2013

Streaming Variational Bayes

arXiv:1307.6769v2 | 365 citations
Originality: Incremental advance
AI Analysis

This work addresses scalable Bayesian inference for streaming data. It is rated incremental because it builds on existing variational methods, extending them to distributed and asynchronous settings.

The authors address Bayesian inference on large-scale data streams by introducing SDA-Bayes, a framework for streaming, distributed, and asynchronous computation of Bayesian posteriors. They apply it to latent Dirichlet allocation on two large document collections and report advantages over stochastic variational inference in streaming settings.

We present SDA-Bayes, a framework for (S)treaming, (D)istributed, (A)synchronous computation of a Bayesian posterior. The framework makes streaming updates to the estimated posterior according to a user-specified approximation batch primitive. We demonstrate the usefulness of our framework, with variational Bayes (VB) as the primitive, by fitting the latent Dirichlet allocation model to two large-scale document collections. We demonstrate the advantages of our algorithm over stochastic variational inference (SVI) by comparing the two after a single pass through a known amount of data---a case where SVI may be applied---and in the streaming setting, where SVI does not apply.
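
The streaming idea in the abstract, where a batch primitive repeatedly updates the estimated posterior, can be illustrated with a small sketch. It is not the authors' implementation: it assumes a conjugate Beta-Bernoulli model as the primitive (so each batch posterior is exact) rather than variational Bayes on latent Dirichlet allocation, and the names `beta_bernoulli_primitive`, `streaming_update`, and `distributed_update` are hypothetical. The distributed variant shows one plausible way to combine batch posteriors computed independently from a shared prior, by summing each batch's change in parameters. Asynchrony is not modeled here.

```python
from typing import Callable, Sequence, Tuple

# Assumption for illustration: the posterior is a Beta(alpha, beta)
# distribution, stored as its two pseudo-counts.
Params = Tuple[float, float]
Primitive = Callable[[Sequence[int], Params], Params]


def beta_bernoulli_primitive(batch: Sequence[int], prior: Params) -> Params:
    """Hypothetical batch primitive: exact Beta posterior for 0/1 data."""
    alpha, beta = prior
    ones = sum(batch)
    return (alpha + ones, beta + len(batch) - ones)


def streaming_update(batches: Sequence[Sequence[int]],
                     prior: Params,
                     primitive: Primitive) -> Params:
    """Treat the posterior after each batch as the prior for the next one."""
    posterior = prior
    for batch in batches:
        posterior = primitive(batch, posterior)
    return posterior


def distributed_update(batches: Sequence[Sequence[int]],
                       prior: Params,
                       primitive: Primitive) -> Params:
    """Process every batch against the same prior, then sum the changes.

    For a Beta distribution the pseudo-counts differ from the natural
    parameters only by a constant, so adding pseudo-count differences is
    equivalent to adding natural-parameter differences.
    """
    total = list(prior)
    for batch in batches:
        local = primitive(batch, prior)  # could run on a separate worker
        for i in range(len(total)):
            total[i] += local[i] - prior[i]
    return (total[0], total[1])


if __name__ == "__main__":
    data = [[1, 0, 1], [1, 1], [0]]
    uniform_prior = (1.0, 1.0)
    print(streaming_update(data, uniform_prior, beta_bernoulli_primitive))
    print(distributed_update(data, uniform_prior, beta_bernoulli_primitive))
```

With an exact conjugate primitive the two routes give the same posterior. With an approximate primitive such as VB they would generally differ, which is part of why the choice of primitive is left to the user.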

Code Implementations: 2 repos
