LG, AI, ST · May 25, 2022

On Bridging the Gap between Mean Field and Finite Width in Deep Random Neural Networks with Batch Normalization

arXiv:2205.13076v3 · 2 citations · h-index: 108
Originality: Incremental advance
AI Analysis

This work addresses a theoretical gap in the understanding of deep neural network behavior and is aimed at researchers in machine learning theory, though it appears incremental because it builds on existing mean-field theory.

The paper tackles the problem of error amplification in mean-field predictions for deep neural networks by analyzing the role of batch normalization (BN) at initialization, showing that BN stabilizes representations and enables concentration bounds even in infinitely-deep networks with finite width.

Mean-field theory is widely used in theoretical studies of neural networks. In this paper, we analyze the role of depth in the concentration of mean-field predictions, specifically for deep multilayer perceptrons (MLPs) with batch normalization (BN) at initialization. Mean-field predictions are derived by scaling the network width to infinity, and it is postulated that at finite width they suffer from layer-wise errors that amplify with depth. We demonstrate that BN stabilizes the distribution of representations, which avoids this error propagation. The stabilization, which is characterized by a geometric mixing property, allows us to establish concentration bounds for mean-field predictions in infinitely deep neural networks with a finite width.
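The setting in the abstract can be explored numerically with a small simulation. The sketch below is not the paper's construction: it assumes a chain of the form h ← BN(W · ReLU(h)) with i.i.d. Gaussian weights, BN applied per feature across the batch, and a batch of 4 inputs. The function names (`batch_norm`, `gram_trajectory`, `fluctuation`) and the fluctuation measure are hypothetical choices made for illustration. It tracks the batch Gram matrix across layers at several widths so that its layer-to-layer fluctuation at finite width can be inspected.

```python
import numpy as np

def batch_norm(h, eps=1e-8):
    """Normalize each feature (row) across the batch (columns)."""
    h = h - h.mean(axis=1, keepdims=True)
    return h / np.sqrt(h.var(axis=1, keepdims=True) + eps)

def gram_trajectory(width, depth, batch=4, seed=0):
    """Return the Gram matrices G_l = H_l^T H_l / width, one per layer.

    Assumed layer update (illustrative only): H <- BN(W @ relu(H)),
    with W drawn i.i.d. Gaussian with variance 1/width.
    """
    rng = np.random.default_rng(seed)
    h = batch_norm(rng.standard_normal((width, batch)))
    grams = []
    for _ in range(depth):
        w = rng.standard_normal((width, width)) / np.sqrt(width)
        h = batch_norm(w @ np.maximum(h, 0.0))
        grams.append(h.T @ h / width)
    return np.stack(grams)

def fluctuation(grams, burn_in=50):
    """Mean standard deviation of Gram entries across layers after burn-in.

    A crude, hypothetical proxy for how far a finite-width network
    wanders around its typical Gram matrix as depth grows.
    """
    return float(grams[burn_in:].std(axis=0).mean())

if __name__ == "__main__":
    for width in (32, 128, 512):
        g = gram_trajectory(width, depth=200)
        print(f"width={width:4d}  fluctuation={fluctuation(g):.4f}")
```

If the stabilization described in the abstract carries over to this simplified chain, the reported fluctuation should stay bounded as depth increases rather than growing layer by layer. This is an informal check under the stated assumptions, not a reproduction of the paper's bounds.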

