LG, CV, ML | Apr 2, 2019

Correlated Parameters to Accurately Measure Uncertainty in Deep Neural Networks

arXiv:1904.01334v1 | 31 citations
Originality: Incremental advance
AI Analysis

This paper addresses uncertainty measurement for deep learning practitioners, but it appears incremental: it builds on existing Bayesian methods and contributes a specific covariance structure.

The paper tackles uncertainty quantification and overfitting in deep neural networks by proposing a Bayesian training approach based on variational inference with tridiagonal covariance matrices, which the authors evaluate on the MNIST and CIFAR-10 datasets.

In this article, a novel approach for training deep neural networks using Bayesian techniques is presented. The Bayesian methodology allows for an easy evaluation of model uncertainty and is additionally robust to overfitting. These are commonly the two main problems that classical, i.e. non-Bayesian, architectures struggle with. The proposed approach applies variational inference to approximate the intractable posterior distribution. In particular, the variational distribution is defined as a product of multiple multivariate normal distributions with tridiagonal covariance matrices. Each normal distribution belongs either to the weights or to the biases of one network layer. The layer-wise a posteriori variances are defined based on the corresponding expectation values, and the correlations are assumed to be identical. Therefore, only a few additional parameters need to be optimized compared to non-Bayesian settings. The novel approach is successfully evaluated on the popular benchmark datasets MNIST and CIFAR-10.
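
To make the covariance structure in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each standard deviation is proportional to the absolute value of the corresponding mean (a hypothetical scale `tau`, with a small `eps` added for numerical safety) and that neighbouring parameters share a single correlation `rho`. The abstract only states that variances are "based on" the expectation values and that correlations are identical, so the exact parameterization here is an assumption, as are all function names. The sketch relies on the fact that a tridiagonal covariance matrix has a lower bidiagonal Cholesky factor, so sampling costs time linear in the number of parameters.

```python
import math
import random


def build_covariance_bands(mu, tau, rho, eps=1e-8):
    """Return the main diagonal and sub-diagonal of a tridiagonal covariance.

    Assumption (not taken from the paper): sigma_i = tau * |mu_i| + eps,
    and adjacent entries share the same correlation rho.
    """
    sigma = [tau * abs(m) + eps for m in mu]
    diag = [s * s for s in sigma]
    off = [rho * sigma[i] * sigma[i + 1] for i in range(len(mu) - 1)]
    return diag, off


def tridiagonal_cholesky(diag, off):
    """Cholesky factor of a symmetric positive definite tridiagonal matrix.

    Returns (l_diag, l_off), the diagonal and sub-diagonal of the lower
    bidiagonal factor L with L @ L.T equal to the input matrix.
    """
    n = len(diag)
    l_diag = [0.0] * n
    l_off = [0.0] * (n - 1)
    l_diag[0] = math.sqrt(diag[0])
    for i in range(1, n):
        l_off[i - 1] = off[i - 1] / l_diag[i - 1]
        # math.sqrt raises ValueError if the matrix is not positive definite.
        l_diag[i] = math.sqrt(diag[i] - l_off[i - 1] ** 2)
    return l_diag, l_off


def sample_parameters(mu, tau, rho, rng):
    """Draw one sample x = mu + L z with z ~ N(0, I)."""
    diag, off = build_covariance_bands(mu, tau, rho)
    l_diag, l_off = tridiagonal_cholesky(diag, off)
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    sample = []
    for i, m in enumerate(mu):
        value = m + l_diag[i] * z[i]
        if i > 0:
            value += l_off[i - 1] * z[i - 1]
        sample.append(value)
    return sample


if __name__ == "__main__":
    # Hypothetical means for the weights of one small layer.
    layer_means = [0.5, -1.0, 2.0, 0.25]
    rng = random.Random(0)
    print(sample_parameters(layer_means, tau=0.1, rho=0.3, rng=rng))
```

In this sketch, keeping `|rho| < 0.5` makes the tridiagonal correlation matrix strictly diagonally dominant, which guarantees positive definiteness for any number of parameters. Under these assumptions the only quantities added on top of the means are `tau` and `rho` per weight or bias group, which is one way to read the abstract's claim that only a few additional parameters need to be optimized.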

