On the Cone Effect in the Learning Dynamics
This work addresses a central topic in deep learning, the learning dynamics of neural networks, and contributes incremental insight by identifying a specific phenomenon in those dynamics.
The paper studies neural network learning dynamics empirically by tracking how the empirical Neural Tangent Kernel (eNTK) evolves during training. It reveals a two-phase process, with a 'cone effect' in Phase II that yields significant performance advantages over linearized training.
Understanding the learning dynamics of neural networks is a central topic in deep learning. In this paper, we take an empirical perspective and study the learning dynamics of neural networks in real-world settings. Specifically, we investigate how the empirical Neural Tangent Kernel (eNTK) evolves during training. Our key findings reveal a two-phase learning process: i) in Phase I, the eNTK evolves significantly, signaling the rich regime, and ii) in Phase II, the eNTK keeps evolving but is confined to a narrow space, a phenomenon we term the cone effect. This two-phase framework builds on the hypothesis proposed by Fort et al. (2020); our contribution is to identify the cone effect in Phase II and to demonstrate its significant performance advantages over fully linearized training.
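
To make the object of study concrete, the sketch below computes an eNTK, K(x, x') = <grad_theta f(x), grad_theta f(x')>, for a toy one-hidden-layer network and measures how far the kernel has moved between two parameter settings. Everything in it is an illustrative assumption rather than the paper's setup: the network f(x) = v . tanh(W x), the helper names (jacobian, entk, kernel_distance), the random "checkpoints", and the choice of one minus the cosine similarity of the two kernel matrices as the distance.

```python
import numpy as np

def jacobian(params, X):
    """Per-example gradient of f(x) = v . tanh(W x) w.r.t. all parameters (toy model, assumed)."""
    W, v = params                                   # W: (h, d), v: (h,)
    H = np.tanh(X @ W.T)                            # hidden activations, (n, h)
    # df/dW[j, k] = v[j] * (1 - tanh^2(W x)[j]) * x[k]
    dW = (v * (1.0 - H ** 2))[:, :, None] * X[:, None, :]   # (n, h, d)
    # df/dv[j] = tanh(W x)[j]
    dv = H                                          # (n, h)
    return np.concatenate([dW.reshape(len(X), -1), dv], axis=1)

def entk(params, X):
    """Empirical NTK Gram matrix: inner products of per-example parameter gradients."""
    J = jacobian(params, X)
    return J @ J.T                                  # (n, n)

def kernel_distance(K1, K2):
    """One minus the cosine similarity of two kernel matrices (assumed metric)."""
    cos = np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))
    return 1.0 - cos

# Hypothetical stand-ins for two training checkpoints.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
theta0 = (rng.normal(size=(16, 5)), rng.normal(size=16))
theta1 = (theta0[0] + 0.1 * rng.normal(size=(16, 5)),
          theta0[1] + 0.1 * rng.normal(size=16))

K0, K1 = entk(theta0, X), entk(theta1, X)
d = kernel_distance(K0, K1)
print(f"kernel distance between checkpoints: {d:.4f}")
```

Under these assumptions, tracking such a distance across real training checkpoints is one way to see whether the kernel is changing substantially or staying within a narrow region; the thresholds separating the two phases are not specified here.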