PR ST ML Aug 28, 2018

Mean Field Analysis of Neural Networks: A Central Limit Theorem

Justin Sirignano, Konstantinos Spiliopoulos

arXiv:1808.09372v234.2216 citations

Originality: Incremental advance

AI Analysis

This paper provides theoretical foundations for understanding the behavior of neural networks with many hidden units, though it is an incremental extension of existing mean-field analysis.

The authors rigorously prove a central limit theorem for single-hidden-layer neural networks in the asymptotic regime of a large number of hidden units and many stochastic gradient descent iterations, showing that the fluctuations around the mean-field limit are Gaussian and satisfy a stochastic partial differential equation.

We rigorously prove a central limit theorem for neural network models with a single hidden layer. The central limit theorem is proven in the asymptotic regime of simultaneously (A) large numbers of hidden units and (B) large numbers of stochastic gradient descent training iterations. Our result describes the neural network's fluctuations around its mean-field limit. The fluctuations have a Gaussian distribution and satisfy a stochastic partial differential equation. The proof relies upon weak convergence methods from stochastic analysis. In particular, we prove relative compactness for the sequence of processes and uniqueness of the limiting process in a suitable Sobolev space.
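The statement in the abstract can be sketched in symbols. The notation below is an assumption introduced here for illustration, since the abstract fixes none: N is the number of hidden units, (c^i, w^i) are the parameters of unit i, sigma is the activation function, and the time scaling floor(Nt) is one common way to encode "many SGD iterations" in mean-field analyses. It may not match the paper's exact setup.

```latex
% Single-hidden-layer network with mean-field (1/N) normalization (assumed form)
g^N(x) = \frac{1}{N} \sum_{i=1}^{N} c^i \, \sigma(w^i \cdot x)

% Empirical measure of the parameters after \lfloor Nt \rfloor SGD iterations
\mu^N_t = \frac{1}{N} \sum_{i=1}^{N}
          \delta_{\left(c^i_{\lfloor Nt \rfloor},\, w^i_{\lfloor Nt \rfloor}\right)}

% Mean-field limit (law of large numbers): \mu^N_t \to \bar{\mu}_t as N \to \infty

% Fluctuation process: deviation from the limit, rescaled by \sqrt{N}
\eta^N_t = \sqrt{N} \left( \mu^N_t - \bar{\mu}_t \right)

% Central limit theorem: \eta^N converges weakly to a Gaussian process \bar{\eta}
% that solves a stochastic partial differential equation
\eta^N \Rightarrow \bar{\eta} \quad \text{as } N \to \infty
```

Read this way, the sqrt(N) rescaling is what makes the result a central limit theorem: the deviation of the finite network from its mean-field limit is of order 1/sqrt(N), and the rescaled deviation has a nondegenerate Gaussian limit.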
