MLMar 10, 2018

Variance Networks: When Expectation Does Not Meet Your Expectations

arXiv:1803.03764v527 citations
Originality Highly original
AI Analysis

This work addresses a fundamental issue in stochastic neural networks for researchers and practitioners in machine learning, offering a novel approach that is incremental but with broad applications.

The paper tackles the limitation of stochastic neural networks that rely on expected weight values by introducing variance layers, where weights follow zero-mean distributions parameterized only by variance, and shows that these layers learn effectively, improve exploration in reinforcement learning, and provide defense against adversarial attacks.

Ordinary stochastic neural networks mostly rely on the expected values of their weights to make predictions, whereas the induced noise is mostly used to capture the uncertainty, prevent overfitting and slightly boost the performance through test-time averaging. In this paper, we introduce variance layers, a different kind of stochastic layers. Each weight of a variance layer follows a zero-mean distribution and is only parameterized by its variance. We show that such layers can learn surprisingly well, can serve as an efficient exploration tool in reinforcement learning tasks and provide a decent defense against adversarial attacks. We also show that a number of conventional Bayesian neural networks naturally converge to such zero-mean posteriors. We observe that in these cases such zero-mean parameterization leads to a much better training objective than conventional parameterizations where the mean is being learned.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes