LG · Nov 5, 2025

Depth-induced NTK: Bridging Over-parameterized Neural Networks and Deep Neural Kernels

arXiv:2511.05585v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses a theoretical gap in how deep learning representations are understood, though it appears incremental: it extends NTK theory to incorporate depth effects.

The paper tackles a limitation of existing neural tangent kernels (NTKs), which overlook network depth, by proposing a depth-induced NTK that converges to a Gaussian process as depth approaches infinity; its theoretical analysis shows stabilized kernel dynamics and mitigated degeneration.

While deep learning has achieved remarkable success across a wide range of applications, the theoretical understanding of its representation learning remains limited. Deep neural kernels provide a principled framework for interpreting over-parameterized neural networks by mapping hierarchical feature transformations into kernel spaces, thereby combining the expressive power of deep architectures with the analytical tractability of kernel methods. Recent advances, particularly neural tangent kernels (NTKs) derived from gradient inner products, have established connections between infinitely wide neural networks and nonparametric Bayesian inference. However, the existing NTK paradigm has been predominantly confined to the infinite-width regime and overlooks the representational role of network depth. To address this gap, we propose a depth-induced NTK based on a shortcut-related architecture, which converges to a Gaussian process as the network depth approaches infinity. We theoretically analyze the training invariance and spectral properties of the proposed kernel, showing that it stabilizes the kernel dynamics and mitigates degeneration. Experimental results further support the effectiveness of the proposed method. Our findings significantly extend the existing landscape of neural kernel theory and provide an in-depth understanding of deep learning and the scaling law.
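To make "derived from gradient inner products" concrete, here is a minimal sketch of the standard empirical NTK, K(x, x') = <grad_theta f(x), grad_theta f(x')>, for a toy one-hidden-layer tanh network. It is not the paper's depth-induced kernel or its shortcut architecture, neither of which is specified here. The network form, the 1/sqrt(width) scaling, and the names `init_params`, `param_gradient`, and `empirical_ntk` are all assumptions chosen for illustration.

```python
import math
import random


def init_params(width, dim, seed=0):
    """Draw standard-normal weights for the toy model f(x) = a . tanh(W x) / sqrt(width)."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(width)]
    a = [rng.gauss(0.0, 1.0) for _ in range(width)]
    return W, a


def param_gradient(x, W, a):
    """Flattened gradient of the scalar output f(x) with respect to all parameters (a, W)."""
    scale = 1.0 / math.sqrt(len(a))
    grad = []
    for w_j, a_j in zip(W, a):
        h = math.tanh(sum(w * xi for w, xi in zip(w_j, x)))
        grad.append(scale * h)  # df/da_j
        # df/dW_j = a_j * tanh'(w_j . x) * x / sqrt(width), with tanh' = 1 - tanh^2
        grad.extend(scale * a_j * (1.0 - h * h) * xi for xi in x)
    return grad


def empirical_ntk(xs, W, a):
    """Gram matrix K[i][j] = <grad f(x_i), grad f(x_j)> over the inputs xs."""
    grads = [param_gradient(x, W, a) for x in xs]
    return [[sum(g * h for g, h in zip(gi, gj)) for gj in grads] for gi in grads]


# Hypothetical inputs and width, chosen only for illustration.
W, a = init_params(width=256, dim=2)
xs = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
K = empirical_ntk(xs, W, a)
```

Because `K` is a Gram matrix of gradient vectors, it is symmetric and positive semidefinite by construction. The infinite-width theory mentioned in the abstract concerns the limit of this matrix as the width grows, whereas the paper studies a limit in depth instead.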
