Boris Nectoux

27.4 · NE · Jun 4

Quantifying Uncertainty In Wide Two-Layer Neural Networks: On The Law Of The Limiting Fluctuation Process

Arnaud Descours, Arnaud Guillin, Geoffrey Lacour et al.

Uncertainty quantification for neural network predictions is a central issue in practical applications. Our approach seeks to reduce computational cost by evaluating uncertainty directly from PDE information on the asymptotic variance, rather than through the deep ensemble method, which can be seen as a Monte Carlo estimate of the prediction and requires training multiple networks. We thus study the law of the limiting process describing the random fluctuations around the mean-field limit of wide two-layer neural networks trained by stochastic gradient descent in a weak-noise regime. Building on a recent trajectorial central limit theorem, in which this limit is characterized as the weak solution of a linear stochastic evolution equation, we identify its law explicitly. More precisely, we show that it is a centered Gaussian process in the dual of a weighted Sobolev space, and we derive a closed covariance representation for the finite-dimensional distributions obtained by testing it against smooth functions. This covariance is expressed through the solution of a backward transport equation with a nonlocal source term, whose coefficients are driven by the mean-field trajectory. As a consequence, by testing against the activation function at a fixed input, we obtain an expression for the limiting variance of the corresponding network-output fluctuations. We illustrate this result numerically on a one-dimensional regression example.
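For context, the deep ensemble baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the network form f(x) = (1/N) Σ a_i tanh(w_i x + b_i), the function names, and all hyperparameters are assumptions chosen for a one-dimensional toy regression. It shows why the baseline is costly: the variance at a fixed input is estimated by training several independent networks.

```python
import numpy as np

def train_two_layer(x, y, n_neurons, n_steps, lr, rng):
    """Train f(x) = (1/N) * sum_i a_i * tanh(w_i * x + b_i) by one-sample SGD."""
    a = rng.normal(size=n_neurons)
    w = rng.normal(size=n_neurons)
    b = rng.normal(size=n_neurons)
    for _ in range(n_steps):
        k = rng.integers(len(x))            # draw one training sample
        h = np.tanh(w * x[k] + b)           # hidden activations
        err = h @ a / n_neurons - y[k]      # prediction error on that sample
        # Gradients of 0.5 * err**2, multiplied by N (assumed mean-field step scaling)
        grad_pre = err * a * (1.0 - h**2)
        a -= lr * err * h
        w -= lr * grad_pre * x[k]
        b -= lr * grad_pre
    return a, w, b

def predict(params, x0):
    """Network output at a single input x0."""
    a, w, b = params
    return np.tanh(w * x0 + b) @ a / len(a)

def ensemble_variance(x, y, x0, n_models=20, n_neurons=200,
                      n_steps=500, lr=0.1, seed=0):
    """Monte Carlo mean and variance of the prediction at x0 over independent trainings."""
    rng = np.random.default_rng(seed)
    preds = np.array([
        predict(train_two_layer(x, y, n_neurons, n_steps, lr, rng), x0)
        for _ in range(n_models)
    ])
    return preds.mean(), preds.var(ddof=1)
```

The cost grows linearly with `n_models`, which is the overhead the PDE-based variance expression is meant to avoid.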

ML · Jun 10, 2024

Central Limit Theorem for Bayesian Neural Network trained with Variational Inference

Arnaud Descours, Tom Huix, Arnaud Guillin et al.

In this paper, we rigorously derive Central Limit Theorems (CLTs) for Bayesian two-layer neural networks in the infinite-width limit, trained by variational inference on a regression task. The networks are trained via different maximization schemes of the regularized evidence lower bound: (i) the idealized case with exact estimation of a multiple Gaussian integral from the reparametrization trick, (ii) a minibatch scheme using Monte Carlo sampling, commonly known as Bayes-by-Backprop, and (iii) a computationally cheaper algorithm named Minimal VI. The latter was recently introduced by leveraging information obtained at the level of the mean-field limit. Laws of large numbers have already been rigorously proven for the three schemes, which admit the same asymptotic limit. By deriving CLTs, this work shows that the idealized and Bayes-by-Backprop schemes have similar fluctuation behavior, which differs from that of Minimal VI. Numerical experiments then illustrate that the Minimal VI scheme is still more efficient, in spite of larger variances, thanks to its substantial gain in computational complexity.
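The reparametrization trick underlying schemes (i) and (ii) can be illustrated on a toy Gaussian expectation. The sketch below is an assumption-laden example, not the paper's ELBO objective: the function name `reparam_gradients` and the test integrand g(w) = w² are hypothetical. Writing w = mean + std · ε with ε standard normal turns gradients of E[g(w)] with respect to the variational parameters into expectations that can be estimated by Monte Carlo sampling, which is the kind of estimator Bayes-by-Backprop applies on minibatches.

```python
import numpy as np

def reparam_gradients(grad_g, mean, std, n_samples, rng):
    """Monte Carlo gradients of E[g(w)], w ~ N(mean, std^2), w.r.t. mean and std.

    Uses w = mean + std * eps with eps ~ N(0, 1), so that
    d/dmean E[g(w)] = E[g'(w)] and d/dstd E[g(w)] = E[g'(w) * eps].
    """
    eps = rng.standard_normal(n_samples)
    w = mean + std * eps          # reparametrized samples
    gw = grad_g(w)                # derivative of g evaluated at each sample
    return gw.mean(), (gw * eps).mean()

# Toy check with g(w) = w**2, where E[g(w)] = mean**2 + std**2,
# so the exact gradients are 2 * mean and 2 * std.
rng = np.random.default_rng(0)
grad_mean, grad_std = reparam_gradients(lambda w: 2.0 * w, 1.5, 0.5, 200_000, rng)
```

In the idealized scheme the Gaussian integral is evaluated exactly, whereas a sampled estimate like this one introduces extra noise at each step.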
