On the infinite-depth limit of finite-width neural networks
This work advances the theoretical understanding of neural network limits and is aimed at researchers; its contribution is incremental, since it builds on prior studies of infinite-width and infinite-depth limits.
The paper investigates the infinite-depth limit of finite-width residual neural networks with random Gaussian weights. It shows that the pre-activations converge to a zero-drift diffusion process whose distribution depends on the activation function, gives closed-form expressions in two cases, and reveals a change of regime in the post-activation norms as the width increases from 3 to 4.
In this paper, we study the infinite-depth limit of finite-width residual neural networks with random Gaussian weights. With proper scaling, we show that by fixing the width and taking the depth to infinity, the pre-activations converge in distribution to a zero-drift diffusion process. Unlike the infinite-width limit, where the pre-activations converge weakly to a Gaussian random variable, the infinite-depth limit yields different distributions depending on the choice of activation function. We document two cases where these distributions have distinct closed-form expressions. We further show an intriguing change-of-regime phenomenon in the post-activation norms when the width increases from 3 to 4. Lastly, we study the sequential limit infinite-depth-then-infinite-width and compare it with the more commonly studied infinite-width-then-infinite-depth limit.
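The setting can be explored numerically with a small Monte Carlo simulation. The following is a minimal sketch, not the authors' code. It assumes the residual update Y_l = Y_{l-1} + L^{-1/2} W_l φ(Y_{l-1}), with the entries of W_l drawn i.i.d. from N(0, 1/n) for width n and depth L; the abstract only says "with proper scaling", so this specific form is an assumption. The function name `simulate_preactivations` and its parameters are hypothetical.

```python
import numpy as np


def simulate_preactivations(width, depth, x0, activation, n_samples, seed=0):
    """Sample last-layer pre-activations of random residual networks.

    Assumed update (not stated in the abstract):
        Y_l = Y_{l-1} + W_l @ activation(Y_{l-1}) / sqrt(depth),
    with W_l entries i.i.d. N(0, 1/width).
    Returns an array of shape (n_samples, width).
    """
    rng = np.random.default_rng(seed)
    # Each row is an independent network evaluated on the same input x0.
    y = np.tile(np.asarray(x0, dtype=float), (n_samples, 1))
    for _ in range(depth):
        # Fresh Gaussian weights for every layer and every sampled network.
        w = rng.normal(0.0, 1.0 / np.sqrt(width),
                       size=(n_samples, width, width))
        # Batched matrix-vector product, scaled by 1/sqrt(depth).
        y = y + np.einsum("sij,sj->si", w, activation(y)) / np.sqrt(depth)
    return y


def relu(y):
    return np.maximum(y, 0.0)


if __name__ == "__main__":
    x0 = np.ones(3)
    samples = simulate_preactivations(width=3, depth=200, x0=x0,
                                      activation=relu, n_samples=2000)
    # The weights have zero mean, so the empirical mean should stay near x0.
    print("empirical mean:", samples.mean(axis=0))
```

Under these assumptions, rerunning the script with a different `activation` (for example the identity, `lambda y: y`) and comparing histograms of one coordinate gives a rough picture of how the large-depth distribution changes with the activation function. Varying `width` at fixed large `depth` is one way to probe the dependence on width numerically.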