Universal Scaling Laws of Absorbing Phase Transitions in Artificial Deep Neural Networks

arXiv:2307.02284v3
6 citations
h-index: 22
AI Analysis

This paper provides a theoretical framework connecting critical phenomena to the behavior of deep neural networks, potentially offering insights into network design and generalization.

The authors demonstrate that deep neural networks near the edge of chaos exhibit the universal scaling laws of absorbing phase transitions, with multilayer perceptrons and convolutional neural networks belonging to different universality classes. Their analysis of training reveals that tuning hyperparameters to the phase boundary is necessary but insufficient for optimal generalization, with nonuniversal metric factors playing a significant role.

We demonstrate that conventional artificial deep neural networks operating near the phase boundary of the signal propagation dynamics, also known as the edge of chaos, exhibit universal scaling laws of absorbing phase transitions in non-equilibrium statistical mechanics. We exploit the fully deterministic nature of the propagation dynamics to elucidate an analogy between a signal collapse in neural networks and an absorbing state (a state that the system can enter but cannot escape from). Our numerical results indicate that multilayer perceptrons and convolutional neural networks belong to the mean-field and the directed percolation universality classes, respectively. Finite-size scaling is also successfully applied, suggesting a potential connection to the depth-width trade-off in deep learning. Furthermore, our analysis of the training dynamics under gradient descent reveals that hyperparameter tuning to the phase boundary is necessary but insufficient for achieving optimal generalization in deep networks. Remarkably, nonuniversal metric factors associated with the scaling laws are shown to play a significant role in making the above observations concrete. These findings highlight the usefulness of the notion of criticality for analyzing the behavior of artificial deep neural networks and offer new insights toward a unified understanding of the essential relationship between criticality and intelligence.
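The analogy between signal collapse and an absorbing state can be illustrated with a small numerical sketch. The code below is not the authors' experiment; it assumes a bias-free tanh multilayer perceptron with Gaussian weights of standard deviation `sigma_w / sqrt(width)`, and the function name `pair_distance_by_depth` and all parameter values are hypothetical choices made for illustration. It pushes two nearby inputs through the same random layers and records their distance at each depth. Because the propagation is deterministic, two signals that have become identical stay identical at every later layer, which is the sense in which a zero distance behaves like an absorbing state.

```python
import numpy as np


def pair_distance_by_depth(sigma_w, width=200, depth=50, eps=0.1, seed=0):
    """Propagate two nearby inputs through one random tanh MLP.

    Returns the normalized distance between the two hidden states after
    each layer. All settings here are illustrative assumptions, not the
    configuration used in the paper.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(width)
    # Second input: a small perturbation of the first (eps=0 gives a copy).
    x2 = x1 + eps * rng.standard_normal(width)

    distances = []
    for _ in range(depth):
        # Fresh Gaussian weights per layer, shared by both signals.
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        x1 = np.tanh(W @ x1)
        x2 = np.tanh(W @ x2)
        distances.append(float(np.linalg.norm(x1 - x2) / np.sqrt(width)))
    return distances


if __name__ == "__main__":
    for sigma_w in (0.5, 1.0, 2.0):
        d = pair_distance_by_depth(sigma_w)
        print(f"sigma_w={sigma_w}: distance at last layer = {d[-1]:.3e}")
```

For this bias-free tanh setup, the mean-field phase boundary is commonly placed at `sigma_w = 1`: well below it the distance decays toward zero with depth, and well above it the distance settles at a finite value. Calling the function with `eps=0` starts the pair in the zero-distance state, and the distance then remains exactly zero for any `sigma_w`.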
