Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
This position paper argues that solving simple, tractable models first is a foundational way to accelerate the science of deep learning. It could benefit researchers studying neural dynamics, though the contribution is incremental in that it builds on existing simplification methods.
The authors propose layerwise linear models as simplified representations for understanding complex neural network dynamics, such as neural collapse and grokking. Their central tool is the dynamical feedback principle, which describes how layers mutually govern and amplify each other's evolution.
In physics, complex systems are often simplified into minimal, solvable models that retain only the core principles. In machine learning, layerwise linear models (e.g., linear neural networks) act as simplified representations of neural network dynamics. These models follow the dynamical feedback principle, which describes how layers mutually govern and amplify each other's evolution. This principle extends beyond the simplified models, successfully explaining a wide range of dynamical phenomena in deep neural networks, including neural collapse, emergence, lazy and rich regimes, and grokking. In this position paper, we call for the use of layerwise linear models retaining the core principles of neural dynamical phenomena to accelerate the science of deep learning.
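To make the dynamical feedback principle concrete, here is a minimal sketch rather than the authors' own setup. It assumes the smallest possible layerwise linear model, a scalar two-layer network f(x) = a * b * x, trained by plain gradient descent on a squared loss with a single input x = 1. The function name `train_two_layer_scalar`, the target, the learning rate, and the initialization are illustrative choices, not values from the paper. The gradient for each weight is proportional to the other weight, so two small weights stall each other early on and then amplify each other, which gives a long plateau followed by a rapid drop in loss.

```python
import numpy as np

def train_two_layer_scalar(a0, b0, target=1.0, lr=0.01, steps=2000):
    """Gradient descent on L = 0.5 * (a*b - target)**2 for f(x) = a*b*x, x = 1.

    Hypothetical toy setup: both 'layers' are single scalars a and b.
    Returns an array of shape (steps, 3) with columns (a, b, loss).
    """
    a, b = a0, b0
    history = []
    for _ in range(steps):
        err = a * b - target
        # Feedback: the update to a scales with b, and the update to b scales with a.
        grad_a = err * b
        grad_b = err * a
        a, b = a - lr * grad_a, b - lr * grad_b
        # The loss stored here is the one measured just before this update.
        history.append((a, b, 0.5 * err ** 2))
    return np.array(history)

# Small, balanced initialization (an assumption chosen to make the plateau visible).
hist = train_two_layer_scalar(a0=0.01, b0=0.01)

for step in (0, 100, 400, 600, 1999):
    a, b, loss = hist[step]
    print(f"step {step:4d}  a={a:.4f}  b={b:.4f}  loss={loss:.6f}")
```

In this toy run the loss barely moves while both weights are small, then falls quickly once they have grown enough to amplify each other. Increasing the initial scale (for example `a0=b0=1.0`) removes most of the plateau, which is one simple way to see how initialization changes the shape of the dynamics in such a model.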