Understanding the Evolution of the Neural Tangent Kernel at the Edge of Stability
This work addresses a gap in the understanding of NTK eigenvector behavior during EoS. The contribution is incremental but relevant to researchers studying training dynamics in deep learning.
The paper investigates the dynamics of Neural Tangent Kernel (NTK) eigenvectors during the Edge of Stability (EoS) phenomenon in deep learning. It finds that, across architectures, larger learning rates increase the alignment between the leading eigenvectors of the final NTK and the training targets, and it provides a theoretical analysis for a two-layer linear network.
The study of Neural Tangent Kernels (NTKs) in deep learning has drawn increasing attention in recent years. NTKs typically evolve during training, and this evolution is related to feature learning. In parallel, recent work on Gradient Descent (GD) has identified a phenomenon called Edge of Stability (EoS), in which the largest eigenvalue of the NTK oscillates around a value inversely proportional to the step size. Although follow-up works have explored the mechanism behind this eigenvalue behavior in depth, the behavior of the NTK eigenvectors during EoS remains poorly understood. This paper examines the dynamics of NTK eigenvectors during EoS in detail. Across different architectures, we observe that larger learning rates cause the leading eigenvectors of the final NTK, as well as the full NTK matrix, to align more closely with the training target. We then study the underlying mechanism of this phenomenon and provide a theoretical analysis for a two-layer linear network. Our study enhances the understanding of GD training dynamics in deep learning.
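To make the notion of alignment concrete, the following is a minimal sketch of how one might measure it for a two-layer linear network. Everything in it is an illustrative assumption rather than the paper's own setup: the parameterization f(x) = a^T W x without scaling factors, the function names, the random data, and the two metrics (the standard kernel-target alignment y^T K y / (||K||_F ||y||^2) for the full matrix, and the squared cosine between the top eigenvector and y for the leading eigenvector).

```python
import numpy as np

def ntk_two_layer_linear(X, W, a):
    """Empirical NTK of f(x) = a^T W x with respect to all parameters (W, a).

    The gradient w.r.t. W is the outer product a x^T and the gradient w.r.t. a
    is W x, so K(x, x') = ||a||^2 <x, x'> + <W x, W x'>.
    """
    gram = X @ X.T        # <x, x'> for every pair of training inputs
    hidden = X @ W.T      # row i holds W x_i
    return (a @ a) * gram + hidden @ hidden.T

def kernel_target_alignment(K, y):
    """Alignment of the full kernel matrix with the target vector y."""
    return float(y @ K @ y / (np.linalg.norm(K, "fro") * (y @ y)))

def top_eigvec_alignment(K, y):
    """Squared cosine between the leading eigenvector of K and y."""
    _, eigvecs = np.linalg.eigh(K)   # eigenvalues are returned in ascending order
    v = eigvecs[:, -1]               # unit-norm eigenvector of the largest eigenvalue
    return float((v @ y) ** 2 / (y @ y))

# Hypothetical toy data: n samples, input dimension d, hidden width m.
rng = np.random.default_rng(0)
n, d, m = 20, 5, 16
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)
W = rng.normal(size=(m, d)) / np.sqrt(d)
a = rng.normal(size=m) / np.sqrt(m)

K = ntk_two_layer_linear(X, W, a)
print("kernel-target alignment:", kernel_target_alignment(K, y))
print("top-eigenvector alignment:", top_eigvec_alignment(K, y))
```

Both quantities lie in [0, 1] for a positive semidefinite kernel and equal 1 when K is proportional to y y^T. Evaluating them on the NTK at the end of GD runs with different step sizes is one way to probe the trend described above.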