ML · AI · LG · Oct 13, 2017

Bayesian Hypernetworks

arXiv:1710.04759v2 · 156 citations
Originality: Highly original
AI Analysis

This paper addresses uncertainty quantification and robustness in deep learning, a problem of practical interest to practitioners, though it is incremental in that it builds on existing variational inference and hypernetwork frameworks.

The paper tackles approximate Bayesian inference in neural networks by introducing Bayesian hypernetworks, which transform a simple noise distribution into a complex, multimodal approximate posterior over parameters while keeping sampling cheap. The results show competitive performance on regularization, active learning, and anomaly detection, and a better defense against adversarial examples than dropout.

We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork $h$ is a neural network which learns to transform a simple noise distribution, $p(\boldsymbol{\epsilon}) = \mathcal{N}(\mathbf{0}, \mathbf{I})$, to a distribution $q(\theta) := q(h(\boldsymbol{\epsilon}))$ over the parameters $\theta$ of another neural network (the "primary network"). We train $q$ with variational inference, using an invertible $h$ to enable efficient estimation of the variational lower bound on the posterior $p(\theta \mid \mathcal{D})$ via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap iid sampling of $q(\theta)$. In practice, Bayesian hypernets can provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks which evaluate model uncertainty, including regularization, active learning, and anomaly detection.
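The role of invertibility can be made concrete with a small sketch. When $h$ is invertible, the density of a sampled parameter vector follows from the change-of-variables formula, $\log q(\theta) = \log p(\boldsymbol{\epsilon}) - \log\left|\det \partial h / \partial \boldsymbol{\epsilon}\right|$, so a Monte Carlo estimate of the lower bound needs only samples of $\boldsymbol{\epsilon}$. The NumPy code below is an illustration under stated assumptions, not the paper's implementation: the single RealNVP-style affine coupling layer standing in for $h$, the toy linear-regression primary network, the standard normal prior, and all function names (`coupling_forward`, `coupling_inverse`, `elbo_estimate`) are hypothetical choices made for this example.

```python
import numpy as np


def coupling_forward(eps, W_s, W_t):
    """One affine coupling layer, an assumed stand-in for the hypernetwork h.

    The first half of eps passes through unchanged and parameterizes an
    elementwise affine map applied to the second half.
    """
    d = eps.shape[-1] // 2
    e1, e2 = eps[..., :d], eps[..., d:]
    s = np.tanh(e1 @ W_s)          # log-scale, bounded for numerical stability
    t = e1 @ W_t                   # shift
    theta = np.concatenate([e1, e2 * np.exp(s) + t], axis=-1)
    log_det = s.sum(axis=-1)       # log |det d theta / d eps|
    return theta, log_det


def coupling_inverse(theta, W_s, W_t):
    """Exact inverse of coupling_forward, showing that h is invertible."""
    d = theta.shape[-1] // 2
    t1, t2 = theta[..., :d], theta[..., d:]
    s = np.tanh(t1 @ W_s)
    t = t1 @ W_t
    return np.concatenate([t1, (t2 - t) * np.exp(-s)], axis=-1)


def log_std_normal(x):
    """Log-density of N(0, I), summed over the last axis."""
    dim = x.shape[-1]
    return -0.5 * (x ** 2).sum(axis=-1) - 0.5 * dim * np.log(2.0 * np.pi)


def elbo_estimate(X, y, W_s, W_t, n_samples=64, noise_std=0.1, rng=None):
    """Monte Carlo estimate of E_q[log p(D|theta) + log p(theta) - log q(theta)].

    Assumptions for this sketch: the primary network is linear regression
    y ~ N(X @ theta, noise_std^2) and the prior over theta is N(0, I).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    dim = X.shape[1]               # must be even for the half/half split
    eps = rng.standard_normal((n_samples, dim))
    theta, log_det = coupling_forward(eps, W_s, W_t)

    # Change of variables: log q(theta) = log p(eps) - log |det J|
    log_q = log_std_normal(eps) - log_det
    log_prior = log_std_normal(theta)

    resid = y[None, :] - theta @ X.T            # shape (n_samples, n_data)
    log_lik = (
        -0.5 * (resid / noise_std) ** 2
        - np.log(noise_std)
        - 0.5 * np.log(2.0 * np.pi)
    ).sum(axis=-1)

    return float(np.mean(log_lik + log_prior - log_q))
```

A single coupling layer leaves half of the coordinates untouched, so a practical flow would stack several such layers with the halves swapped or permuted between them, and would optimize `W_s` and `W_t` by gradient ascent on the estimate (for example with an automatic differentiation library). Those steps are omitted here to keep the sketch short.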

