LG · Jul 4, 2021
A Comparison of the Delta Method and the Bootstrap in Deep Learning Classification
Geir K. Nilsen, Antonella Z. Munthe-Kaas, Hans J. Skaug et al.
We validate the recently introduced Delta method adapted to deep learning classification by comparing it with the classical Bootstrap. We show that there is a strong linear relationship between the predictive epistemic uncertainty levels quantified by the two methods when applied to two LeNet-based neural network classifiers using the MNIST and CIFAR-10 datasets. Furthermore, we demonstrate that the Delta method reduces computation time by a factor of five compared to the Bootstrap.
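The kind of comparison described above can be illustrated on a much smaller model. The following is a minimal sketch, assuming an L2-regularized binary logistic regression on synthetic data in place of the LeNet classifiers; the helper `fit_logistic`, the data, and all constants are hypothetical choices for illustration, not the authors' setup. It computes Delta-method and Bootstrap standard deviations of the predicted probabilities and their correlation across test points.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lam, iters=30):
    """Newton's method for L2-regularized logistic regression (mean loss).

    Returns the fitted weights and the Hessian of the cost at those weights.
    """
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        prob = sigmoid(X @ w)
        grad = X.T @ (prob - y) / n + lam * w
        hess = (X.T * (prob * (1.0 - prob))) @ X / n + lam * np.eye(p)
        w = w - np.linalg.solve(hess, grad)
    prob = sigmoid(X @ w)
    hess = (X.T * (prob * (1.0 - prob))) @ X / n + lam * np.eye(p)
    return w, hess

# Synthetic data (assumption: stands in for a real training and test set).
rng = np.random.default_rng(0)
n, p, lam = 500, 3, 1e-3
X = rng.standard_normal((n, p))
w_true = np.array([1.0, -2.0, 0.5])
y = (rng.random(n) < sigmoid(X @ w_true)).astype(float)
X_test = rng.standard_normal((50, p))

# Delta method: Cov(w_hat) is approximated by H^{-1} / n, and the variance of
# each predicted probability by g^T Cov g, where g is its parameter gradient.
w_hat, hess = fit_logistic(X, y, lam)
cov = np.linalg.inv(hess) / n
p_test = sigmoid(X_test @ w_hat)
jac = (p_test * (1.0 - p_test))[:, None] * X_test   # row i: d p_i / d w
delta_std = np.sqrt(np.einsum("ip,pq,iq->i", jac, cov, jac))

# Bootstrap: refit on resampled training sets and take the spread of predictions.
n_boot = 200
boot_probs = np.empty((n_boot, len(X_test)))
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)                # sample with replacement
    w_b, _ = fit_logistic(X[idx], y[idx], lam)
    boot_probs[b] = sigmoid(X_test @ w_b)
boot_std = boot_probs.std(axis=0, ddof=1)

# Pearson correlation between the two uncertainty estimates.
corr = np.corrcoef(delta_std, boot_std)[0, 1]
print(f"correlation between Delta and Bootstrap std: {corr:.3f}")
```

The Bootstrap requires one refit per resample, whereas the Delta method needs a single fit plus one matrix inverse, which is the source of the cost difference in this toy setting. The sketch makes no claim about the size of the speed-up reported in the abstract.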
LG · Dec 2, 2019
Epistemic Uncertainty Quantification in Deep Learning Classification by the Delta Method
Geir K. Nilsen, Antonella Z. Munthe-Kaas, Hans J. Skaug et al.
The Delta method is a classical procedure for quantifying epistemic uncertainty in statistical models, but its direct application to deep neural networks is prevented by the large number of parameters $P$. We propose a low-cost variant of the Delta method applicable to $L_2$-regularized deep neural networks based on the top $K$ eigenpairs of the Fisher information matrix. We address efficient computation of full-rank approximate eigendecompositions in terms of either the exact inverse Hessian, the inverse outer-products of gradients approximation or the so-called Sandwich estimator. Moreover, we provide a bound on the approximation error for the uncertainty of the predictive class probabilities. We observe that when the smallest eigenvalue of the Fisher information matrix is near the $L_2$-regularization rate, the approximation error is close to zero even when $K\ll P$. A demonstration of the methodology is presented using a TensorFlow implementation, and we show that meaningful rankings of images based on predictive uncertainty can be obtained for two LeNet-based neural networks using the MNIST and CIFAR-10 datasets. Further, we observe that false positives have, on average, a higher predictive epistemic uncertainty than true positives. This suggests that the uncertainty measure carries supplementary information not captured by the classification alone.
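One way to read the full-rank approximation from $K$ eigenpairs is sketched below. It assumes that the $P-K$ eigenvalues that are not computed are replaced by the $L_2$-regularization rate, which is an illustrative assumption rather than a statement of the paper's exact procedure. The function names, the synthetic matrix `H`, and the random Jacobian `jac` are hypothetical. The example builds a matrix whose trailing eigenvalues equal the regularization rate exactly, so the approximate inverse coincides with the exact one even though $K < P$.

```python
import numpy as np

def approx_inverse_from_top_k(vals_k, vecs_k, lam):
    """Full-rank approximation of H^{-1} from K eigenpairs.

    Assumption: every eigenvalue outside the top K is replaced by lam, so
    H^{-1} ~ V (diag(1/vals) - I/lam) V^T + I/lam.
    """
    n_params = vecs_k.shape[0]
    correction = 1.0 / vals_k - 1.0 / lam
    return (vecs_k * correction) @ vecs_k.T + np.eye(n_params) / lam

def delta_std(jac, cov):
    """Delta-method standard deviations sqrt(diag(J cov J^T)) for a Jacobian J (C x P)."""
    return np.sqrt(np.einsum("cp,pq,cq->c", jac, cov, jac))

rng = np.random.default_rng(0)
n_params, k, lam, n_train = 20, 3, 1e-2, 1000

# Synthetic symmetric matrix with three large eigenvalues; the rest equal lam.
q, _ = np.linalg.qr(rng.standard_normal((n_params, n_params)))
spectrum = np.full(n_params, lam)
spectrum[:k] = np.array([5.0, 3.0, 2.0]) + lam
H = (q * spectrum) @ q.T

# Top-K eigenpairs (dense solver here; a large model would need an iterative one).
vals, vecs = np.linalg.eigh(H)              # eigenvalues in ascending order
vals_k, vecs_k = vals[-k:], vecs[:, -k:]

H_inv_approx = approx_inverse_from_top_k(vals_k, vecs_k, lam)
H_inv_exact = np.linalg.inv(H)
max_err = np.max(np.abs(H_inv_approx - H_inv_exact))

# Hypothetical Jacobian of C = 10 class probabilities w.r.t. the parameters.
jac = rng.standard_normal((10, n_params))
std_approx = delta_std(jac, H_inv_approx / n_train)
std_exact = delta_std(jac, H_inv_exact / n_train)
print("max inverse error:", max_err)
```

When the trailing eigenvalues differ from the regularization rate, the two inverses differ as well, and the size of that gap is what an error bound of the kind mentioned in the abstract would control.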
LG · May 14, 2019
Efficient Computation of Hessian Matrices in TensorFlow
Geir K. Nilsen, Antonella Z. Munthe-Kaas, Hans J. Skaug et al.
The Hessian matrix has a number of important applications in a variety of fields, such as optimization, image processing and statistics. In this paper we focus on the practical aspects of efficiently computing Hessian matrices in the context of deep learning using the Python scripting language and the TensorFlow library. We define a general feed-forward neural network model and show how to efficiently compute two quantities: the cost function's exact Hessian matrix, and the cost function's approximate Hessian matrix, known as the Outer Product of Gradients (OPG) matrix. Furthermore, as the number of parameters (weights and biases) in deep learning is usually very large, we show how to reduce the quadratic space complexity by an efficient implementation based on approximate eigendecompositions.
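The two quantities and the space-saving idea can be illustrated without TensorFlow. The following is a minimal NumPy sketch, assuming a logistic regression cost in place of a feed-forward network; the function names and the synthetic data are hypothetical. It forms the exact Hessian and the OPG matrix explicitly, then recovers the top $K$ eigenpairs using only Hessian-vector products, so the $P \times P$ matrix is never stored in that step. Orthogonal (subspace) iteration is used here as a simple stand-in for whatever eigensolver the paper employs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def exact_hessian(X, w, lam):
    """Exact Hessian of the mean cross-entropy cost plus (lam / 2) * ||w||^2."""
    prob = sigmoid(X @ w)
    s = prob * (1.0 - prob)                 # per-example curvature weights
    return (X.T * s) @ X / X.shape[0] + lam * np.eye(X.shape[1])

def opg_matrix(X, y, w):
    """Outer Product of Gradients: mean of g_n g_n^T over per-example gradients.

    The regularization term is left out of the per-example gradients here.
    """
    prob = sigmoid(X @ w)
    grads = (prob - y)[:, None] * X         # row n: gradient of example n
    return grads.T @ grads / X.shape[0]

def make_hvp(X, w, lam):
    """Return v -> H v without forming H (O(N P) time, O(N + P) extra memory)."""
    prob = sigmoid(X @ w)
    s = prob * (1.0 - prob)
    return lambda v: X.T @ (s * (X @ v)) / X.shape[0] + lam * v

def top_k_eigenpairs(hvp, n_params, k, iters=300, seed=0):
    """Top-k eigenpairs of a symmetric PSD operator via orthogonal iteration."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n_params, k)))
    for _ in range(iters):
        z = np.column_stack([hvp(q[:, j]) for j in range(k)])
        q, _ = np.linalg.qr(z)
    # Rayleigh-Ritz step on the converged k-dimensional subspace.
    t = q.T @ np.column_stack([hvp(q[:, j]) for j in range(k)])
    vals, s_vecs = np.linalg.eigh((t + t.T) / 2.0)
    order = np.argsort(vals)[::-1]          # descending eigenvalues
    return vals[order], q @ s_vecs[:, order]

# Synthetic data with differently scaled features, so the spectrum has clear gaps.
rng = np.random.default_rng(0)
n, n_params, lam, k = 400, 6, 1e-2, 2
X = rng.standard_normal((n, n_params)) * np.array([4.0, 3.0, 2.0, 1.0, 0.5, 0.25])
w = 0.1 * rng.standard_normal(n_params)
y = (rng.random(n) < sigmoid(X @ w)).astype(float)

H = exact_hessian(X, w, lam)
opg = opg_matrix(X, y, w)
vals_k, vecs_k = top_k_eigenpairs(make_hvp(X, w, lam), n_params, k)
dense_top = np.linalg.eigvalsh(H)[::-1][:k]   # reference from the dense Hessian
print("matrix-free:", vals_k, "dense:", dense_top)
```

Storing $K$ eigenvectors costs $O(KP)$ memory instead of the $O(P^2)$ needed for the full matrix, which is the reduction in space complexity the abstract refers to.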