LG, May 22, 2023

Tight conditions for when the NTK approximation is valid

arXiv:2305.13141v3, 1 citation
Originality: Incremental advance
AI Analysis

The paper provides a theoretical foundation for understanding NTK behavior in deep learning, though the advance is incremental in that it refines existing bounds.

The paper establishes tight conditions for when the neural tangent kernel (NTK) approximation holds in lazy training with square loss, showing that a rescaling factor of α = O(T) suffices for the approximation to remain valid until training time T, improving on the previous requirement of α = O(T²).

We study when the neural tangent kernel (NTK) approximation is valid for training a model with the square loss. In the lazy training setting of Chizat et al. (2019), we show that rescaling the model by a factor of $\alpha = O(T)$ suffices for the NTK approximation to be valid until training time $T$. Our bound is tight and improves on the previous bound of Chizat et al. (2019), which required a larger rescaling factor of $\alpha = O(T^2)$.
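To make the role of the rescaling factor concrete, here is a minimal numerical sketch; it is not the paper's experiment. It assumes the usual lazy-training convention in which the loss is scaled by 1/α², so that gradient flow reads dw/dt = -(1/α) Dh(w)ᵀ(αh(w) - y), and it approximates that flow with small Euler steps. The toy model `g`, the data `XS`/`YS`, the initialization `W0`, and the helper `ntk_gap` are all hypothetical names chosen for illustration. The sketch trains the rescaled model αh and its linearization around `W0` side by side and reports how far their outputs drift apart by time T.

```python
import math

# Hypothetical toy problem: three 1-D inputs and targets, made up for illustration.
XS = [-1.0, 0.5, 2.0]
YS = [0.5, -0.3, 0.8]
W0 = [1.0, 1.0]  # initial parameters (a, b) of g(w; x) = b * tanh(a * x)


def g(w, x):
    return w[1] * math.tanh(w[0] * x)


def h(w):
    """Model outputs on XS, shifted so that h(W0) = 0 (an assumption of this sketch)."""
    return [g(w, x) - g(W0, x) for x in XS]


def jac(w):
    """Jacobian Dh(w): one row [dh/da, dh/db] per input."""
    rows = []
    for x in XS:
        t = math.tanh(w[0] * x)
        rows.append([w[1] * x * (1.0 - t * t), t])
    return rows


def step(w, outputs, jacobian, alpha, dt):
    """One Euler step of dw/dt = -(1/alpha) * J^T (alpha * outputs - y)."""
    resid = [alpha * o - y for o, y in zip(outputs, YS)]
    grad = [sum(jacobian[i][k] * resid[i] for i in range(len(XS))) for k in range(2)]
    return [w[k] - dt * grad[k] / alpha for k in range(2)]


def ntk_gap(alpha, T=2.0, dt=0.01):
    """Distance at time T between alpha*h(w(T)) and its linearized counterpart."""
    J0 = jac(W0)

    def h_lin(w):
        # First-order Taylor expansion of h around W0 (recall h(W0) = 0).
        d = [w[k] - W0[k] for k in range(2)]
        return [sum(row[k] * d[k] for k in range(2)) for row in J0]

    w, w_lin = list(W0), list(W0)
    for _ in range(round(T / dt)):
        w = step(w, h(w), jac(w), alpha, dt)
        w_lin = step(w_lin, h_lin(w_lin), J0, alpha, dt)
    diff = [alpha * (a - b) for a, b in zip(h(w), h_lin(w_lin))]
    return math.sqrt(sum(d * d for d in diff))
```

Calling `ntk_gap(alpha)` for increasing values such as 1, 10, and 100 at a fixed `T` should show the gap shrinking as α grows, which is the qualitative behavior the abstract describes. The sketch does not verify the tightness of the O(T) condition itself.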
