Tuning Universality in Deep Neural Networks
This work addresses the problem of understanding collective dynamics in deep neural networks and is aimed at researchers in theoretical machine learning and statistical physics. It may appear incremental, since it builds on existing stochastic and percolation theories.
The paper addresses the lack of a mechanistic explanation for crackling-like avalanches in deep neural networks. It derives a stochastic theory of deep information propagation that identifies four effective couplings controlling the avalanche dynamics and universality classes. Numerical simulations confirm that activation function design governs the collective dynamics in random DNNs.
Deep neural networks (DNNs) exhibit crackling-like avalanches whose origin lacks a mechanistic explanation. Here, I derive a stochastic theory of deep information propagation (DIP) by incorporating Central Limit Theorem (CLT)-level fluctuations. Four effective couplings $(r, h, D_1, D_2)$ characterize the dynamics, yielding a Landau description of the static exponents and a Directed Percolation (DP) structure of activity cascades. Tuning the couplings selects between avalanche dynamics generated by a Brownian Motion (BM) in a logarithmic trap and those generated by an absorbed free BM, each corresponding to a distinct universality class. Numerical simulations confirm the theory and demonstrate that activation function design controls the collective dynamics in random DNNs.
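To give a feel for the two avalanche generators named above, the following is a minimal numerical sketch, not the paper's model. It assumes that a BM in a logarithmic trap can be written as $dx = -(a/x)\,dt + \sqrt{2}\,dW$, that is, a potential $U(x) = a \log x$ with unit diffusion constant, and that an avalanche's duration is the first time $x$ reaches zero. The function name `avalanche_durations`, the trap strength `a`, and all default values are hypothetical and are not mapped to the couplings $(r, h, D_1, D_2)$. Setting `a = 0` gives the absorbed free BM.

```python
import numpy as np


def avalanche_durations(n_walkers=2000, x0=1.0, a=0.0, dt=1e-2,
                        max_steps=5000, seed=0):
    """Euler-Maruyama first-passage times to x <= 0 for
    dx = -(a / x) dt + sqrt(2) dW, starting from x0 > 0.

    a = 0 : absorbed free Brownian motion.
    a > 0 : Brownian motion in a logarithmic potential U(x) = a * log(x)
            (an assumed illustrative form, not taken from the paper).
    Walkers that survive max_steps are censored at max_steps * dt.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_walkers, float(x0))
    alive = np.ones(n_walkers, dtype=bool)
    steps = np.full(n_walkers, max_steps, dtype=int)

    for step in range(1, max_steps + 1):
        n_alive = int(alive.sum())
        if n_alive == 0:
            break
        xa = x[alive]  # strictly positive, so a / xa is well defined
        noise = rng.normal(0.0, np.sqrt(2.0 * dt), size=n_alive)
        x[alive] = xa - (a / xa) * dt + noise
        absorbed = alive & (x <= 0.0)
        steps[absorbed] = step  # record the absorption time
        alive &= ~absorbed

    return steps * dt


if __name__ == "__main__":
    free = avalanche_durations(a=0.0)
    trapped = avalanche_durations(a=1.0)
    print("mean duration, free BM:        ", free.mean())
    print("mean duration, logarithmic trap:", trapped.mean())
```

Comparing histograms of the two returned arrays on log-log axes is one way to see that the duration statistics differ between the two cases. The time step and the censoring horizon are crude choices, so any exponent read off this sketch should be treated as indicative only.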