Convergence of neural networks to Gaussian mixture distribution
The result gives researchers theoretical insight into neural network behavior, but it is incremental because it builds on existing infinite-width limit theories.
The paper proves that deep random neural networks converge to a Gaussian mixture distribution as the width of the last hidden layer approaches infinity. Experiments support this result and show that increasing this width brings the output distribution closer to a Gaussian mixture, while widening the other layers moves that mixture toward a normal distribution.
We prove that, under relatively mild conditions, fully-connected feed-forward deep random neural networks converge to a Gaussian mixture distribution as only the width of the last hidden layer goes to infinity. We conducted experiments on a simple model, which support our result. Moreover, the experiments give a detailed description of the convergence: growing the last hidden layer brings the distribution closer to the Gaussian mixture, and growing the other layers successively brings the Gaussian mixture closer to the normal distribution.
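The qualitative behavior described in the abstract can be probed with a small Monte Carlo experiment. The sketch below is not the authors' experimental setup; every concrete choice in it is an assumption made for illustration: ReLU activations, no biases, zero-mean Gaussian weights with variance 1/fan_in, a scalar input fixed at 1.0, two hidden layers, and excess kurtosis as a crude measure of distance from normality. Excess kurtosis is zero for a normal distribution and positive for a non-degenerate scale mixture of zero-mean Gaussians, so with the last hidden width held fixed it should shrink as the first hidden layer is widened.

```python
import math
import random


def relu(z):
    return z if z > 0.0 else 0.0


def sample_output(widths, x, rng):
    """Draw fresh Gaussian weights and return the scalar network output at x.

    Assumptions (not taken from the paper): ReLU activations, no biases,
    weights drawn i.i.d. from N(0, 1 / fan_in).
    """
    h = list(x)
    for n_out in widths:
        scale = 1.0 / math.sqrt(len(h))
        # Each hidden unit gets its own freshly sampled row of weights.
        h = [relu(sum(rng.gauss(0.0, scale) * hi for hi in h))
             for _ in range(n_out)]
    scale = 1.0 / math.sqrt(len(h))
    # Linear read-out from the last hidden layer to a single output.
    return sum(rng.gauss(0.0, scale) * hi for hi in h)


def excess_kurtosis(samples):
    """Sample excess kurtosis: fourth central moment / variance^2 - 3."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 * m2) - 3.0


def simulate(widths, n_samples=3000, seed=0):
    """Sample the output of n_samples independent random networks."""
    rng = random.Random(seed)
    return [sample_output(widths, [1.0], rng) for _ in range(n_samples)]


LAST_WIDTH = 20  # width of the last hidden layer, held fixed

# Narrow first hidden layer: the output should look like a Gaussian mixture.
narrow = excess_kurtosis(simulate([2, LAST_WIDTH]))
# Wider first hidden layer: the mixture should move toward a normal law.
wide = excess_kurtosis(simulate([20, LAST_WIDTH]))

print(f"first hidden width 2:  excess kurtosis = {narrow:.2f}")
print(f"first hidden width 20: excess kurtosis = {wide:.2f}")
```

Because the last hidden layer here is finite, neither estimate is expected to reach the limiting value; increasing `LAST_WIDTH` and `n_samples` should tighten the comparison at the cost of a longer run time.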