ML | LG | May 29, 2019

On the Inductive Bias of Neural Tangent Kernels

arXiv:1905.12173v2 | 311 citations
Originality: Synthesis-oriented
AI Analysis

This work provides theoretical insights into why gradient descent generalizes well in over-parameterized regimes, which is important for understanding deep learning optimization.

The paper analyzes the inductive bias of neural tangent kernels (NTKs) in over-parameterized neural networks, examining properties like smoothness, approximation, and stability, including deformation stability in convolutional networks, and compares them to other kernels.

State-of-the-art neural networks are heavily over-parameterized, making the optimization algorithm a crucial ingredient for learning predictive models with good generalization properties. A recent line of work has shown that in a certain over-parameterized regime, the learning dynamics of gradient descent are governed by a certain kernel obtained at initialization, called the neural tangent kernel. We study the inductive bias of learning in such a regime by analyzing this kernel and the corresponding function space (RKHS). In particular, we study smoothness, approximation, and stability properties of functions with finite norm, including stability to image deformations in the case of convolutional networks, and compare to other known kernels for similar architectures.
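To make the kernel in the abstract concrete, here is a minimal sketch for the simplest case: a two-layer ReLU network f(x) = sqrt(2/m) * sum_j v_j * relu(w_j . x) with standard Gaussian weights. The abstract does not state a formula; the closed form used below, K(x, y) = ||x|| ||y|| (u * kappa0(u) + kappa1(u)) with u the cosine of the angle between x and y, is the standard infinite-width result for this architecture and is an assumption here. The function names (`kappa0`, `kappa1`, `ntk_analytic`, `ntk_empirical`) are hypothetical, chosen for illustration only.

```python
import numpy as np

def kappa0(u):
    # Arc-cosine kernel of degree 0 (comes from the ReLU derivative, a step function).
    u = np.clip(u, -1.0, 1.0)
    return (np.pi - np.arccos(u)) / np.pi

def kappa1(u):
    # Arc-cosine kernel of degree 1 (comes from the ReLU itself).
    u = np.clip(u, -1.0, 1.0)
    return (u * (np.pi - np.arccos(u)) + np.sqrt(1.0 - u**2)) / np.pi

def ntk_analytic(x, y):
    # Assumed infinite-width NTK of a two-layer ReLU network.
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    u = np.dot(x, y) / (nx * ny)
    return nx * ny * (u * kappa0(u) + kappa1(u))

def ntk_empirical(x, y, m=200_000, seed=0):
    # Finite-width NTK at initialization: inner product of parameter gradients
    # of f(x) = sqrt(2/m) * sum_j v_j * relu(w_j . x), with Gaussian w_j and v_j.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((m, x.shape[0]))
    v = rng.standard_normal(m)
    ax, ay = W @ x, W @ y
    # Contribution of gradients with respect to the output weights v_j.
    grad_v = np.maximum(ax, 0.0) * np.maximum(ay, 0.0)
    # Contribution of gradients with respect to the hidden weights w_j.
    grad_w = v**2 * (ax > 0) * (ay > 0) * np.dot(x, y)
    return 2.0 / m * np.sum(grad_v + grad_w)

if __name__ == "__main__":
    x = np.array([1.0, 0.0, 0.0])
    y = np.array([0.6, 0.8, 0.0])
    print("analytic :", ntk_analytic(x, y))
    print("empirical:", ntk_empirical(x, y))
```

As the width m grows, the empirical value should approach the analytic one, which illustrates the sense in which the kernel is "obtained at initialization". The width and seed are arbitrary choices for the sketch.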

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
