MLSTJun 12, 2015

On the properties of variational approximations of Gibbs posteriors

arXiv:1506.04091v2284 citations
AI Analysis

This work addresses the computational bottleneck of slow sampling for big datasets in PAC-Bayesian learning, offering a faster alternative with theoretical guarantees, though it is incremental as it builds on existing variational methods.

The paper tackles the intractability of Gibbs posteriors in PAC-Bayesian methods by proposing variational approximations, finding that these approximations often achieve the same convergence rate as the original procedures. It demonstrates this with applications in classification, ranking, and matrix completion, showing good performance on real datasets.

The PAC-Bayesian approach is a powerful set of techniques to derive non- asymptotic risk bounds for random estimators. The corresponding optimal distribution of estimators, usually called the Gibbs posterior, is unfortunately intractable. One may sample from it using Markov chain Monte Carlo, but this is often too slow for big datasets. We consider instead variational approximations of the Gibbs posterior, which are fast to compute. We undertake a general study of the properties of such approximations. Our main finding is that such a variational approximation has often the same rate of convergence as the original PAC-Bayesian procedure it approximates. We specialise our results to several learning tasks (classification, ranking, matrix completion),discuss how to implement a variational approximation in each case, and illustrate the good properties of said approximation on real datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes