LG · ML · Oct 22, 2020

Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity

arXiv:2010.11775v2 · 25 citations
Originality: Incremental advance
AI Analysis

This work addresses a theoretical limitation in how neural network training dynamics are modeled, a question of interest to machine learning researchers. It is rated incremental because it builds on the existing NTK framework.

The paper tackles the performance gap between neural tangent kernels (NTK) and real-world neural networks in generalization and local elasticity by introducing label-aware kernels, showing through theory and experiments that these kernels better simulate neural networks.

As a popular approach to modeling the training dynamics of overparametrized neural networks (NNs), the neural tangent kernel (NTK) is known to fall behind real-world NNs in generalization ability. This performance gap is in part due to the label-agnostic nature of the NTK, which renders the resulting kernel less locally elastic than NNs (He & Su, 2019). In this paper, we introduce a novel approach from the perspective of label-awareness to reduce this gap for the NTK. Specifically, we propose two label-aware kernels, each a superimposition of a label-agnostic part and a hierarchy of label-aware parts with increasing complexity of label dependence, obtained using the Hoeffding decomposition. Through both theoretical and empirical evidence, we show that models trained with the proposed kernels better simulate NNs in terms of generalization ability and local elasticity.
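
To make the idea of a label-agnostic part plus a label-dependent part concrete, here is a minimal sketch. It is not the paper's construction: the RBF kernel standing in for the NTK, the Nadaraya-Watson label smoother, the weight `lam`, and all function names are assumptions chosen for illustration. It only shows how adding a label-dependent term can raise the similarity of same-class pairs relative to cross-class pairs at equal distance, which is the flavor of local elasticity the abstract refers to.

```python
import math

def base_kernel(x, z, bandwidth=1.0):
    """Label-agnostic kernel. An RBF kernel stands in for the NTK (assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def smoothed_label(x, train_x, train_y):
    """Nadaraya-Watson estimate of the label at x from the training labels."""
    weights = [base_kernel(x, xi) for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

def label_aware_kernel(x, z, train_x, train_y, lam=0.5):
    """Label-agnostic part plus lam times a label-dependent correction.

    The correction is a product of smoothed labels (hypothetical form), so it
    is a rank-one positive semidefinite term and the sum stays a valid kernel.
    """
    correction = (smoothed_label(x, train_x, train_y)
                  * smoothed_label(z, train_x, train_y))
    return base_kernel(x, z) + lam * correction

# Toy 1-D training set with labels -1 (left) and +1 (right).
train_x = [(-1.0,), (-0.5,), (0.5,), (1.0,)]
train_y = [-1.0, -1.0, 1.0, 1.0]

# a and b lie on the same side; b and c straddle the class boundary.
# Both pairs are the same distance apart, so the base kernel cannot tell them apart.
a, b, c = (-0.75,), (-0.25,), (0.25,)

same_class = label_aware_kernel(a, b, train_x, train_y)
cross_class = label_aware_kernel(b, c, train_x, train_y)
print("base kernel, same-class pair: ", base_kernel(a, b))
print("base kernel, cross-class pair:", base_kernel(b, c))
print("label-aware, same-class pair: ", same_class)
print("label-aware, cross-class pair:", cross_class)
```

In this toy setup the base kernel assigns both pairs the same similarity, while the label-aware variant scores the same-class pair higher. The paper's kernels derive their label-dependent terms from a Hoeffding decomposition rather than from a smoother like the one assumed here.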

Code Implementations: 1 repo