LG HEP-TH ML

Jun 11, 2020

On the asymptotics of wide networks with polynomial activations

arXiv:2006.06687v113.624 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the theoretical foundations of wide neural networks. The contribution is incremental but important for researchers in machine learning theory.

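The authors prove an existing conjecture about the asymptotic behavior of neural networks in the large width limit, for the case of deep networks with polynomial activation functions. This extends the validity of the results that follow from the conjecture: tight bounds on the behavior of wide networks during stochastic gradient descent, and a derivation of their finite-width dynamics. They also point out a difference in asymptotic behavior between networks with analytic, non-linear activations and those with piecewise-linear activations such as ReLU.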

We consider an existing conjecture addressing the asymptotic behavior of neural networks in the large width limit. The results that follow from this conjecture include tight bounds on the behavior of wide networks during stochastic gradient descent, and a derivation of their finite-width dynamics. We prove the conjecture for deep networks with polynomial activation functions, greatly extending the validity of these results. Finally, we point out a difference in the asymptotic behavior of networks with analytic (and non-linear) activation functions and those with piecewise-linear activations such as ReLU.
