LG · Nov 15, 2022

Characterizing the Spectrum of the NTK via a Power Series Expansion

arXiv:2211.07844v4 · 22 citations · h-index: 28
Originality: Incremental advance
AI Analysis

This work provides theoretical insights into NTK properties for deep learning practitioners, but it is incremental as it builds on existing NTK theory.

The authors derive a power series expansion for the Neural Tangent Kernel (NTK) of deep feedforward networks in the infinite width limit. They use it to relate the effective rank of the NTK to that of the input-data Gram matrix, to analyze the NTK eigenvalues for data drawn uniformly on the sphere, and to derive an asymptotic upper bound on the spectrum for generic data and activation functions with sufficiently fast Hermite coefficient decay.

Under mild conditions on the network initialization we derive a power series expansion for the Neural Tangent Kernel (NTK) of arbitrarily deep feedforward networks in the infinite width limit. We provide expressions for the coefficients of this power series, which depend on both the Hermite coefficients of the activation function and the depth of the network. We observe that faster decay of the Hermite coefficients leads to faster decay in the NTK coefficients, and we explore the role of depth. Using this series, we first relate the effective rank of the NTK to the effective rank of the input-data Gram. Second, for data drawn uniformly on the sphere we study the eigenvalues of the NTK, analyzing the impact of the choice of activation function. Finally, for generic data and activation functions with sufficiently fast Hermite coefficient decay, we derive an asymptotic upper bound on the spectrum of the NTK.
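
The two quantities the abstract leans on, the Hermite coefficients of an activation function and the effective rank of a Gram matrix, can be estimated numerically. The sketch below is illustrative only and is not taken from the paper's code. It assumes the normalized probabilists' Hermite basis h_k = He_k / sqrt(k!) and the trace-over-largest-eigenvalue notion of effective rank, and the paper's own conventions may differ. The function names `hermite_coefficients`, `effective_rank`, and `relu` are hypothetical.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite_coefficients(phi, num_coeffs, num_nodes=80):
    """Approximate mu_k = E[phi(g) h_k(g)] with g ~ N(0, 1), for k < num_coeffs.

    Assumption: h_k = He_k / sqrt(k!) (normalized probabilists' Hermite).
    """
    # Gauss-Hermite nodes and weights for the weight function exp(-x^2 / 2).
    nodes, weights = hermegauss(num_nodes)
    # Rescale so the weighted sum is an expectation under N(0, 1).
    weights = weights / math.sqrt(2.0 * math.pi)
    values = phi(nodes)
    coeffs = []
    for k in range(num_coeffs):
        basis = np.zeros(k + 1)
        basis[k] = 1.0  # selects He_k in the HermiteE series
        h_k = hermeval(nodes, basis) / math.sqrt(math.factorial(k))
        coeffs.append(float(np.sum(weights * values * h_k)))
    return np.array(coeffs)


def effective_rank(gram):
    """Trace divided by the largest eigenvalue of a symmetric PSD matrix."""
    eigenvalues = np.linalg.eigvalsh(gram)  # returned in ascending order
    return float(np.trace(gram) / eigenvalues[-1])


def relu(x):
    return np.maximum(x, 0.0)


if __name__ == "__main__":
    # Hermite coefficients of ReLU; its odd coefficients beyond k = 1 vanish.
    print(hermite_coefficients(relu, 6))

    # Effective rank of the Gram matrix of random unit-norm inputs.
    rng = np.random.default_rng(0)
    inputs = rng.standard_normal((50, 10))
    inputs /= np.linalg.norm(inputs, axis=1, keepdims=True)
    print(effective_rank(inputs @ inputs.T))
```

Gauss-Hermite quadrature converges slowly for non-smooth activations such as ReLU, so the higher-order even coefficients printed here should be read as rough estimates. Smooth activations are approximated far more accurately with the same number of nodes.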

Code Implementations: 1 repo
