Optimal Non-Asymptotic Edgeworth Expansions for Multivariate Neural Network Outputs

arXiv:2605.24072

Predicted impact top 78% in ML · last 90 days

Originality Incremental advance

AI Analysis

Provides rigorous theoretical guarantees for higher-order corrections to the Gaussian approximation of neural networks, relevant for understanding finite-width effects in deep learning theory.

The paper derives non-asymptotic Edgeworth expansions of arbitrary order for finite-width neural networks, proving total variation distance bounds of order $n^{-m}$ with matching lower bounds, and applies them to quantify Bayesian posterior approximation errors.

Finite-width fully connected neural networks with Gaussian-initialized weights deviate from their infinite-width Gaussian limit, exhibiting non-vanishing higher-order cumulants. We approximate these deviations, for a neural network evaluated at a finite number of inputs, using multidimensional Edgeworth expansions of arbitrary order $4m-1$, with $m\in\mathbb{N}$. Assuming that the corresponding Gaussian limit has an invertible covariance matrix and that the activation function is polynomially bounded, we establish a bound of order $n^{-m}$ on the total variation distance between the law of the true network output and its Edgeworth approximation, with matching lower bounds. As an application, we quantify the error in Bayesian posterior distributions when the prior is replaced by its Edgeworth expansion. Our results are more general and also apply to sequences of conditionally Gaussian vectors converging to a Gaussian vector with invertible covariance.
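
The non-vanishing higher-order cumulants mentioned in the abstract can be illustrated numerically. The sketch below is not the paper's construction: it assumes a single hidden layer of width $n$, a ReLU activation (which is polynomially bounded), standard Gaussian weights, and the scalar input $x = 1$, with output $f = n^{-1/2}\sum_i v_i\,\mathrm{relu}(w_i)$. The function name `excess_kurtosis` and all parameters are hypothetical. Conditionally on the hidden layer, $f$ is Gaussian with variance $s = n^{-1}\sum_i \mathrm{relu}(w_i)^2$, so its excess kurtosis equals $3\,\mathrm{Var}(s)/\mathbb{E}[s]^2$, which under these assumptions works out to $15/n$.

```python
import numpy as np


def excess_kurtosis(n, num_samples=20000, seed=0):
    """Monte Carlo estimate of the excess kurtosis of a width-n toy network output.

    Assumed toy model (illustration only, not taken from the paper):
    f = n**-0.5 * sum_i v_i * relu(w_i), with v_i, w_i i.i.d. N(0, 1).
    """
    rng = np.random.default_rng(seed)
    # Hidden pre-activations at the scalar input x = 1 are standard Gaussian.
    z = rng.standard_normal((num_samples, n))
    # Conditional on the hidden layer, f ~ N(0, s) with s = mean_i relu(z_i)^2.
    s = np.mean(np.maximum(z, 0.0) ** 2, axis=1)
    # E[f^2] = E[s] and E[f^4] = 3 E[s^2], so the excess kurtosis of f
    # is 3 * Var(s) / E[s]^2; it vanishes only when s is deterministic.
    return 3.0 * s.var() / s.mean() ** 2


# Compare the estimate with the closed form 15 / n for this toy model.
for n in (16, 64, 256):
    print(n, excess_kurtosis(n), 15.0 / n)
```

In this toy setting the fourth cumulant decays like $1/n$ but never vanishes at finite width, which is the kind of deviation from the Gaussian limit that an Edgeworth correction is meant to capture.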
