Quantitative Gaussian Approximation of Randomly Initialized Deep Neural Networks
This work provides quantitative theoretical guarantees for the Gaussian approximation of randomly initialized neural networks, an incremental but important step toward understanding initialization in deep learning.
The authors derive explicit upper bounds on the quadratic Wasserstein distance between the output distribution of a randomly initialized deep neural network and a suitable Gaussian process. The bounds quantify how the layer sizes affect Gaussian behavior and recover distributional convergence in the wide limit.
Given any deep fully connected neural network, initialized with random Gaussian parameters, we bound from above the quadratic Wasserstein distance between its output distribution and a suitable Gaussian process. Our explicit inequalities indicate how the hidden and output layer sizes affect the Gaussian behaviour of the network and quantitatively recover the distributional convergence results in the wide limit, i.e., if all the hidden layer sizes become large.
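
The qualitative content of the statement above can be checked numerically in a very reduced setting. The following is a minimal sketch, not the construction used in the paper: it assumes a ReLU activation, weights drawn from N(0, 1/fan_in), no biases, a single fixed input and a scalar output, all of which are choices made here for illustration. Under these assumptions the limiting Gaussian has variance 2^(-depth), and the one-dimensional quadratic Wasserstein distance can be estimated by matching sorted samples with Gaussian quantiles. The helper names (`sample_output`, `w2_to_gaussian`, `estimate_w2`) are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def sample_output(rng, x, widths):
    """Scalar output of one freshly initialized ReLU network evaluated at x."""
    a = x
    for n_out in widths:
        # Assumption of this sketch: weights ~ N(0, 1/fan_in), no biases.
        w = rng.normal(0.0, 1.0 / np.sqrt(a.size), size=(n_out, a.size))
        a = np.maximum(w @ a, 0.0)
    w_out = rng.normal(0.0, 1.0 / np.sqrt(a.size), size=a.size)
    return float(w_out @ a)


def w2_to_gaussian(samples, sigma):
    """Empirical 1-D quadratic Wasserstein distance to N(0, sigma^2),
    obtained by pairing sorted samples with Gaussian quantiles."""
    n = len(samples)
    gauss = NormalDist(0.0, sigma)
    q = np.array([gauss.inv_cdf((i + 0.5) / n) for i in range(n)])
    return float(np.sqrt(np.mean((np.sort(samples) - q) ** 2)))


def estimate_w2(width, depth=2, n_networks=4000, seed=0):
    rng = np.random.default_rng(seed)
    # |x|^2 / fan_in = 1, so first-layer pre-activations are exactly N(0, 1).
    x = np.ones(3)
    outs = np.array(
        [sample_output(rng, x, [width] * depth) for _ in range(n_networks)]
    )
    # With this scaling each ReLU layer halves the variance.
    sigma = 0.5 ** (depth / 2)
    return w2_to_gaussian(outs, sigma)


w2_narrow = estimate_w2(width=2)
w2_wide = estimate_w2(width=64)
print(f"width 2: {w2_narrow:.3f}   width 64: {w2_wide:.3f}")
```

The estimate for the wider network should come out noticeably smaller, in line with the wide-limit convergence described in the abstract. The estimate also includes sampling error from using finitely many networks, so it should not be read as a check of the paper's explicit rates.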