ML LG Oct 11, 2018

Understanding Priors in Bayesian Neural Networks at the Unit Level

arXiv:1810.05193v2, 25 citations
Originality: Incremental advance
AI Analysis

The paper provides theoretical insight into regularization effects in Bayesian neural networks. It is an incremental advance: it starts from the known L2 ("weight decay") regularization induced by Gaussian weight priors and extends the analysis to the behavior of individual units.

The paper studies deep Bayesian neural networks with Gaussian weight priors and ReLU-like nonlinearities. It shows that the prior induced on unit activations becomes increasingly heavy-tailed with layer depth: first-layer units are Gaussian, second-layer units are sub-exponential, and units in deeper layers are sub-Weibull. Simulation experiments corroborate the result.
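To make the three tail classes concrete, the block below writes out one common form of the sub-Weibull tail condition. The symbols X, K, θ and ℓ are notation introduced here for illustration, and the paper's own parameterization may differ. The depth pattern θ = ℓ/2 is an assumption: it is the simplest rule that matches the Gaussian and sub-exponential cases named above, and it is not quoted from the paper's theorem.

```latex
% One common form of the sub-Weibull tail condition,
% with tail parameter \theta > 0 and some constant K > 0:
\[
  \mathbb{P}\bigl(|X| \ge x\bigr) \le \exp\!\bigl(-(x/K)^{1/\theta}\bigr)
  \quad \text{for all } x \ge 0 .
\]
% Special cases matching the summary:
%   \theta = 1/2 : sub-Gaussian tails (first layer, where units are exactly Gaussian)
%   \theta = 1   : sub-exponential tails (second layer)
% Assumed pattern for layer \ell (our reading, not a quotation):
\[
  \theta_\ell = \ell / 2 , \qquad \ell = 1, 2, 3, \dots
\]
```

A larger θ means a slower-decaying tail, so under this reading each additional layer permits heavier tails than the one before.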

We investigate deep Bayesian neural networks with Gaussian weight priors and a class of ReLU-like nonlinearities. Bayesian neural networks with Gaussian priors are well known to induce an L2, "weight decay", regularization. Our results characterize a more intricate regularization effect at the level of the unit activations. Our main result establishes that the induced prior distribution on the units before and after activation becomes increasingly heavy-tailed with the depth of the layer. We show that first layer units are Gaussian, second layer units are sub-exponential, and units in deeper layers are characterized by sub-Weibull distributions. Our results provide new theoretical insight on deep Bayesian neural networks, which we corroborate with simulation experiments.
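The depth effect described above can be checked numerically with a small prior-sampling experiment. The following is a minimal sketch, not the authors' simulation code: the function names, the fixed all-ones input, the layer widths, the 1/fan-in weight variance, and the use of excess kurtosis as a tail-heaviness proxy are all assumptions made for illustration.

```python
import numpy as np


def sample_preactivations(n_samples, widths, input_dim=10, seed=0):
    """Sample pre-activations of one unit per layer under a Gaussian weight prior.

    Every sample uses an independent draw of all weights (one network per
    sample), and the input is held fixed. Hypothetical helper for illustration.
    """
    rng = np.random.default_rng(seed)
    # Fixed input, repeated once per sampled network: (n_samples, input_dim).
    h = np.ones((n_samples, input_dim))
    recorded = []
    for width in widths:
        fan_in = h.shape[1]
        # Independent Gaussian weights for each sampled network,
        # scaled by 1/sqrt(fan_in) (an assumed variance choice).
        w = rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(n_samples, fan_in, width))
        g = np.einsum("ni,nio->no", h, w)  # pre-activations, (n_samples, width)
        recorded.append(g[:, 0].copy())    # keep the first unit of this layer
        h = np.maximum(g, 0.0)             # ReLU nonlinearity
    return recorded


def excess_kurtosis(z):
    """Sample excess kurtosis: 0 for a Gaussian, positive for heavier tails."""
    z = np.asarray(z, dtype=float)
    z = z - z.mean()
    return float(np.mean(z ** 4) / np.mean(z ** 2) ** 2 - 3.0)


if __name__ == "__main__":
    acts = sample_preactivations(100_000, widths=(10, 10, 10))
    for layer, a in enumerate(acts, start=1):
        print(f"layer {layer}: excess kurtosis = {excess_kurtosis(a):.2f}")
```

With a fixed input, the first-layer unit is a linear combination of Gaussian weights and so is exactly Gaussian, giving an excess kurtosis near zero. Deeper units are Gaussian only conditionally on the previous layer's random activations, so their kurtosis should grow with depth. Kurtosis is a coarse proxy and does not by itself establish a sub-exponential or sub-Weibull tail.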

