NE NA Oct 9, 2018

Collective evolution of weights in wide neural networks

arXiv:1810.03974v18.15 citations

Originality: Incremental advance

AI Analysis

This paper provides a theoretical framework for understanding training dynamics in large-width neural networks. The advance is incremental but useful for researchers in machine learning theory.

The authors derived a transport equation to model weight evolution in wide neural networks under gradient descent, and validated it with linear free-knot splines, showing good agreement in optima, stability, and convergence rates.

We derive a nonlinear integro-differential transport equation describing collective evolution of weights under gradient descent in large-width neural-network-like models. We characterize stationary points of the evolution and analyze several scenarios where the transport equation can be solved approximately. We test our general method in the special case of linear free-knot splines, and find good agreement between theory and experiment in observations of global optima, stability of stationary points, and convergence rates.
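As a rough illustration of the experimental setting the abstract mentions, the sketch below runs plain gradient descent on a wide linear free-knot spline and records the loss. It is not the authors' code. The model form f(x) = (1/N) Σ c_n·max(x − a_n, 0), the target y = x², the initialization, and the step-size scaling by N are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def model(x, a, c):
    """Assumed free-knot linear spline: f(x) = (1/N) * sum_n c_n * max(x - a_n, 0)."""
    return np.maximum(x[:, None] - a[None, :], 0.0) @ c / a.size


def loss(x, y, a, c):
    """Half mean-squared error over the sample points."""
    return 0.5 * np.mean((model(x, a, c) - y) ** 2)


def gd_step(x, y, a, c, lr):
    """One gradient-descent step on knots a and coefficients c."""
    n, m = a.size, x.size
    act = np.maximum(x[:, None] - a[None, :], 0.0)           # shape (m, n)
    active = (x[:, None] > a[None, :]).astype(float)         # d act / d(x - a)
    resid = act @ c / n - y                                   # shape (m,)
    grad_c = act.T @ resid / (m * n)                          # dL/dc_n
    grad_a = -c * (active.T @ resid) / (m * n)                # dL/da_n
    # Assumption: multiply the step by N so each weight moves at an O(1) rate
    # as the width grows (a mean-field style scaling).
    return a - lr * n * grad_a, c - lr * n * grad_c


def train(n_units=200, n_samples=256, steps=500, lr=0.2, seed=0):
    """Fit y = x^2 on [0, 1]; return final weights and the loss history."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_samples)
    y = x ** 2                                                # illustrative target
    a = rng.uniform(-0.5, 1.0, n_units)                       # knot positions
    c = rng.normal(0.0, 1.0, n_units)                         # slopes
    losses = [loss(x, y, a, c)]
    for _ in range(steps):
        a, c = gd_step(x, y, a, c, lr)
        losses.append(loss(x, y, a, c))
    return a, c, np.array(losses)


if __name__ == "__main__":
    a, c, losses = train()
    print(f"initial loss {losses[0]:.4e}, final loss {losses[-1]:.4e}")
```

A histogram of the pairs (a_n, c_n) at successive steps gives an empirical picture of the weight distribution whose collective evolution a transport equation would describe in the large-N limit.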
