LG · AI · ML · Mar 1, 2020

Stein Variational Inference for Discrete Distributions

arXiv:2003.00605v1 · 23 citations
AI Analysis

This work addresses a gap in gradient-based inference for discrete distributions, benefiting researchers and practitioners in machine learning, particularly in areas like graphical models and neural network ensembles.

The paper addresses the problem of applying Stein variational gradient descent (SVGD) to discrete distributions. It proposes a framework that transforms a discrete distribution into an equivalent piecewise continuous one, on which gradient-free SVGD performs approximate inference. The method outperforms Gibbs sampling and discontinuous Hamiltonian Monte Carlo on discrete graphical model benchmarks, and it outperforms other widely used ensemble methods when learning ensembles of binarized AlexNet on CIFAR-10.

Gradient-based approximate inference methods, such as Stein variational gradient descent (SVGD), provide simple and general-purpose inference engines for differentiable continuous distributions. However, existing forms of SVGD cannot be directly applied to discrete distributions. In this work, we fill this gap by proposing a simple yet general framework that transforms discrete distributions into equivalent piecewise continuous distributions, on which gradient-free SVGD is applied to perform efficient approximate inference. The empirical results show that our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo on various challenging benchmarks of discrete graphical models. We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs), outperforming other widely used ensemble methods on learning binarized AlexNet on the CIFAR-10 dataset. In addition, such a transform can be straightforwardly employed in the gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) tests on discrete distributions. Our proposed method outperforms existing GOF test methods for intractable discrete distributions.
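
As a rough illustration of the transform-then-SVGD idea described in the abstract, the following NumPy sketch handles a binary target on {-1, +1}^d. It assumes a standard Gaussian base density p0, a sign map Γ(x) = sign(x) as the partition, the continuous target p_c(x) ∝ p0(x) p*(Γ(x)), and the base density itself as the gradient-free surrogate. The Ising-style target, the function names (`gamma`, `log_p_star`, `gf_svgd_step`), the kernel bandwidth rule, and the step size are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np

def gamma(x):
    """Map continuous points to binary states in {-1, +1} by sign.

    Assumed partition: every orthant has equal mass under N(0, I), so if
    x ~ p_c(x) ∝ p0(x) p*(gamma(x)), then gamma(x) is distributed as p*.
    """
    return np.where(x >= 0, 1.0, -1.0)

def log_p_star(z, J, b):
    """Unnormalized log-probability of an illustrative Ising-style target."""
    return 0.5 * np.einsum("ni,ij,nj->n", z, J, z) + z @ b

def gf_svgd_step(x, J, b, step=0.1):
    """One gradient-free SVGD update using the surrogate rho = N(0, I)."""
    n = x.shape[0]
    # Importance weights w_i ∝ rho(x_i) / p_c(x_i) = 1 / p*(gamma(x_i)),
    # computed in log space and normalized to sum to one.
    log_w = -log_p_star(gamma(x), J, b)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # RBF kernel with a median-heuristic bandwidth (an assumption here).
    diff = x[:, None, :] - x[None, :, :]        # diff[i, j] = x_i - x_j
    sq = (diff ** 2).sum(-1)
    h = np.median(sq) / np.log(n + 1) + 1e-8
    k = np.exp(-sq / h)                         # k[i, j] = k(x_i, x_j)
    grad_log_rho = -x                           # score of the standard Gaussian
    grad_k = -2.0 / h * diff * k[:, :, None]    # gradient of k(x_i, x_j) w.r.t. x_i
    # phi(x_j) = sum_i w_i [ k(x_i, x_j) grad_log_rho(x_i) + grad_{x_i} k(x_i, x_j) ]
    phi = (np.einsum("i,ij,id->jd", w, k, grad_log_rho)
           + np.einsum("i,ijd->jd", w, grad_k))
    return x + step * phi
```

In use, one would initialize particles from the base density, call `gf_svgd_step` repeatedly, and read off discrete samples with `gamma(x)`. Only the discrete target's unnormalized probability is evaluated; no gradient of the piecewise continuous target is needed, which is the reason the gradient-free variant of SVGD fits this setting.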

