ML, Nov 10, 2015

Black-box $α$-divergence Minimization

José Miguel Hernández-Lobato, Yingzhen Li, Mark Rowland, Daniel Hernández-Lobato, Thang Bui, Richard E. Turner

arXiv:1511.03243v3, 18.4152 citations, h-index: 42

Originality: Incremental advance

AI Analysis

BB-α gives practitioners a flexible inference method for complex probabilistic models, though it appears incremental because it builds on existing divergence minimization techniques.

The authors tackled the problem of approximate inference in probabilistic models by introducing BB-α, a method that minimizes α-divergences and scales to large datasets using stochastic gradient descent. Experiments on probit regression and neural networks showed that non-standard α settings, like α=0.5, often yield better predictions than variational Bayes or expectation propagation.

Black-box alpha (BB-$α$) is a new approximate inference method based on the minimization of $α$-divergences. BB-$α$ scales to large datasets because it can be implemented using stochastic gradient descent. BB-$α$ can be applied to complex probabilistic models with little effort since it only requires as input the likelihood function and its gradients. These gradients can be easily obtained using automatic differentiation. By changing the divergence parameter $α$, the method is able to interpolate between variational Bayes (VB) ($α\rightarrow 0$) and an algorithm similar to expectation propagation (EP) ($α= 1$). Experiments on probit regression and neural network regression and classification problems show that BB-$α$ with non-standard settings of $α$, such as $α= 0.5$, usually produces better predictions than with $α\rightarrow 0$ (VB) or $α= 1$ (EP).
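To make the interpolation between VB and EP concrete, the sketch below numerically evaluates an $α$-divergence between two one-dimensional Gaussians. It assumes the Amari form $D_α[p\|q] = \frac{1}{α(1-α)}\left(1 - \int p(θ)^{α} q(θ)^{1-α}\,dθ\right)$, which the abstract itself does not spell out; under that form the divergence tends to $\mathrm{KL}[q\|p]$ (the VB objective) as $α\rightarrow 0$ and to $\mathrm{KL}[p\|q]$ (the EP-style objective) as $α\rightarrow 1$. This is not the BB-$α$ algorithm, only an illustration of the divergence family it minimizes; the function names, the two example Gaussians, and the integration grid are all hypothetical choices made for this example.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def alpha_divergence(p, q, alpha, lo=-20.0, hi=20.0, n=4001):
    """Amari alpha-divergence D_alpha[p || q] for 0 < alpha < 1.

    The integral of p^alpha * q^(1 - alpha) is approximated with the
    trapezoid rule on [lo, hi] (an assumption: both densities must be
    negligible outside that interval).
    """
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * p(x) ** alpha * q(x) ** (1.0 - alpha)
    return (1.0 - total * h) / (alpha * (1.0 - alpha))


def kl_gauss(m1, s1, m2, s2):
    """Closed-form KL[N(m1, s1^2) || N(m2, s2^2)], used as a reference."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2) - 0.5


# Illustrative densities (hypothetical): p plays the role of the target,
# q the role of the approximation.
def p(x):
    return gaussian_pdf(x, 0.0, 1.0)


def q(x):
    return gaussian_pdf(x, 1.0, 1.5)


near_vb = alpha_divergence(p, q, 1e-3)       # close to KL[q || p]
halfway = alpha_divergence(p, q, 0.5)        # symmetric in p and q
near_ep = alpha_divergence(p, q, 1.0 - 1e-3) # close to KL[p || q]
```

Comparing `near_vb` with `kl_gauss(1.0, 1.5, 0.0, 1.0)` and `near_ep` with `kl_gauss(0.0, 1.0, 1.0, 1.5)` shows the two limits, while `halfway` sits between them and does not change if `p` and `q` are swapped.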
