LG, ML | Dec 2, 2019

Epistemic Uncertainty Quantification in Deep Learning Classification by the Delta Method

Geir K. Nilsen, Antonella Z. Munthe-Kaas, Hans J. Skaug, Morten Brun

arXiv:1912.00832v27.14 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses uncertainty quantification for deep learning practitioners, offering an incremental improvement by adapting a classical statistical method to neural networks.

The authors apply the Delta method to epistemic uncertainty quantification in deep neural networks by proposing a low-cost variant based on the top eigenpairs of the Fisher information matrix. On the MNIST and CIFAR-10 datasets, the resulting uncertainty estimates give meaningful rankings of images, and false positives have on average higher uncertainty than true positives.

The Delta method is a classical procedure for quantifying epistemic uncertainty in statistical models, but its direct application to deep neural networks is prevented by the large number of parameters $P$. We propose a low-cost variant of the Delta method applicable to $L_2$-regularized deep neural networks based on the top $K$ eigenpairs of the Fisher information matrix. We address efficient computation of full-rank approximate eigendecompositions in terms of either the exact inverse Hessian, the inverse outer-products of gradients approximation, or the so-called Sandwich estimator. Moreover, we provide a bound on the approximation error for the uncertainty of the predictive class probabilities. We observe that when the smallest eigenvalue of the Fisher information matrix is near the $L_2$-regularization rate, the approximation error is close to zero even when $K\ll P$. A demonstration of the methodology is presented using a TensorFlow implementation, and we show that meaningful rankings of images based on predictive uncertainty can be obtained for two LeNet-based neural networks using the MNIST and CIFAR-10 datasets. Further, we observe that false positives have on average a higher predictive epistemic uncertainty than true positives. This suggests that the uncertainty measure carries supplementary information not captured by the classification alone.
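The low-rank idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' TensorFlow implementation. It assumes a toy linear softmax classifier, a synthetic symmetric positive-definite matrix `H` standing in for the regularized Fisher information or Hessian, and a hypothetical helper `delta_std` that replaces every eigenvalue outside the top $K$ with the regularization rate `lam`. Any $1/N$ scaling of the covariance is assumed to be folded into the eigenvalues.

```python
import numpy as np

def delta_std(jac, top_vals, top_vecs, tail_val):
    """Delta-method standard deviations sqrt(diag(J S J^T)), where
    S = U diag(1/top_vals) U^T + (1/tail_val) (I - U U^T).
    jac: (C, P) Jacobian of class probabilities w.r.t. parameters.
    top_vals, top_vecs: top-K eigenpairs, shapes (K,) and (P, K).
    tail_val: value assumed for the remaining P - K eigenvalues."""
    proj = jac @ top_vecs                                   # (C, K)
    top = (proj ** 2 / top_vals).sum(axis=1)                # top-K subspace
    tail = ((jac ** 2).sum(axis=1) - (proj ** 2).sum(axis=1)) / tail_val
    return np.sqrt(np.maximum(top + tail, 0.0))

rng = np.random.default_rng(0)
C, D, lam = 3, 4, 0.01            # classes, input features, L2 rate (toy values)
P = C * D                         # number of parameters

# Synthetic matrix: three large eigenvalues, all others equal to lam.
Q, _ = np.linalg.qr(rng.normal(size=(P, P)))
spectrum = np.full(P, lam)
spectrum[-3:] = [0.5, 1.0, 2.0]
H = (Q * spectrum) @ Q.T

# Jacobian of softmax(W x) with respect to the row-major flattened W.
x = rng.normal(size=D)
W = rng.normal(size=(C, D))
z = W @ x
p = np.exp(z - z.max())
p /= p.sum()
jac = (np.diag(p) - np.outer(p, p)) @ np.kron(np.eye(C), x[None, :])

vals, vecs = np.linalg.eigh(H)    # eigenvalues in ascending order
exact = np.sqrt(np.diag(jac @ np.linalg.inv(H) @ jac.T))
approx_k3 = delta_std(jac, vals[-3:], vecs[:, -3:], lam)
approx_k1 = delta_std(jac, vals[-1:], vecs[:, -1:], lam)
```

In this toy setting, `approx_k3` agrees with `exact` because every eigenvalue outside the top three equals `lam`, which mirrors the abstract's observation about the smallest eigenvalue lying near the regularization rate. With `K = 1` the discarded eigenvalues are underestimated by `lam`, so `approx_k1` can only overestimate the exact standard deviations.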
