Implicit Regularization via Neural Feature Alignment
This work offers insight into why deep learning models generalize well, a foundational question for machine learning researchers, though its contribution is incremental.
The paper tackles the problem of implicit regularization in deep learning by analyzing it from a geometrical perspective, showing that neural tangent features dynamically align along task-relevant directions, which acts as a feature selection and compression mechanism.
We approach the problem of implicit regularization in deep learning from a geometrical viewpoint. We highlight a regularization effect induced by a dynamical alignment of the neural tangent features, introduced by Jacot et al., along a small number of task-relevant directions. This can be interpreted as a combined mechanism of feature selection and compression. By extrapolating a new analysis of Rademacher complexity bounds for linear models, we motivate and study a heuristic complexity measure that captures this phenomenon, in terms of sequences of tangent kernel classes along optimization paths.
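To make the idea of tangent features aligning with the task more concrete, here is a minimal sketch in plain Python. It is not the paper's code or its complexity measure. It assumes a toy model f(x) = sum_j a[j] * tanh(w[j] * x) on scalar inputs, takes the tangent features to be the gradient of f with respect to (a, w), and tracks the uncentered kernel alignment between the resulting tangent kernel and the labels over a few gradient descent steps. The model, the data, the choice of alignment score, and all function names are illustrative assumptions.

```python
import math

def predict(x, a, w):
    """Toy model (an assumption, not from the paper): f(x) = sum_j a[j] * tanh(w[j] * x)."""
    return sum(aj * math.tanh(wj * x) for aj, wj in zip(a, w))

def tangent_features(x, a, w):
    """Gradient of f(x) with respect to the parameters, ordered as (a, w)."""
    grad_a = [math.tanh(wj * x) for wj in w]
    grad_w = [aj * x * (1.0 - math.tanh(wj * x) ** 2) for aj, wj in zip(a, w)]
    return grad_a + grad_w

def tangent_kernel(xs, a, w):
    """Gram matrix of the tangent features: K[i][j] = <phi(x_i), phi(x_j)>."""
    feats = [tangent_features(x, a, w) for x in xs]
    return [[sum(p * q for p, q in zip(fi, fj)) for fj in feats] for fi in feats]

def kernel_alignment(K, y):
    """Uncentered alignment <K, y y^T>_F / (||K||_F * ||y||^2).

    Lies in [0, 1] for a nonzero positive semidefinite K.
    """
    n = len(y)
    num = sum(y[i] * K[i][j] * y[j] for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    y_norm_sq = sum(v * v for v in y)
    return num / (k_norm * y_norm_sq)

def gd_step(xs, ys, a, w, lr):
    """One full-batch gradient descent step on the mean squared error (with factor 1/2)."""
    n, m = len(xs), len(a)
    grad = [0.0] * (2 * m)
    for x, y in zip(xs, ys):
        residual = predict(x, a, w) - y
        phi = tangent_features(x, a, w)
        for k in range(2 * m):
            grad[k] += residual * phi[k] / n
    params = [p - lr * g for p, g in zip(a + w, grad)]
    return params[:m], params[m:]

def alignment_path(xs, ys, a, w, steps, lr):
    """Alignment of the tangent kernel with the labels at each point of the optimization path."""
    path = [kernel_alignment(tangent_kernel(xs, a, w), ys)]
    for _ in range(steps):
        a, w = gd_step(xs, ys, a, w, lr)
        path.append(kernel_alignment(tangent_kernel(xs, a, w), ys))
    return path

# Hypothetical toy task: predict the sign of a scalar input.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
path = alignment_path(xs, ys, a=[0.5, -0.3, 0.2], w=[0.1, 0.4, -0.2], steps=5, lr=0.1)
print(path)
```

The printed list gives one alignment value per point on the optimization path. Whether and how quickly these values grow depends on the model, the data, and the learning rate, so the sketch only shows how such a quantity can be monitored. It is not evidence for the paper's claims.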