LG, NE, ML · Oct 1, 2019

The asymptotic spectrum of the Hessian of DNN throughout training

arXiv:1910.02875v2 · 39 citations
Originality: Incremental advance
AI Analysis

This work addresses theoretical challenges in analyzing DNN optimization dynamics, offering incremental insights into Hessian behavior for researchers in machine learning theory.

The authors study the spectrum of the Hessian of the cost of deep neural networks (DNNs) during training, using the Neural Tangent Kernel (NTK). When the NTK is fixed during training, they characterize the full asymptotic spectrum of the Hessian, both at initialization and during training. In the mean-field limit, where the NTK is not fixed, they describe the first two moments of the Hessian at initialization.

The dynamics of DNNs during gradient descent is described by the so-called Neural Tangent Kernel (NTK). In this article, we show that the NTK allows one to gain precise insight into the Hessian of the cost of DNNs. When the NTK is fixed during training, we obtain a full characterization of the asymptotics of the spectrum of the Hessian, at initialization and during training. In the so-called mean-field limit, where the NTK is not fixed during training, we describe the first two moments of the Hessian at initialization.
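
As a minimal illustration of why the NTK carries information about the Hessian, assume a squared-error cost C = (1/2N) Σ (f(x_i) - y_i)². Its Hessian splits into a Gauss-Newton term (1/N) JᵀJ, where J is the Jacobian of the network outputs with respect to the parameters, plus a term weighted by the residuals f(x_i) - y_i. The nonzero eigenvalues of JᵀJ coincide with those of the empirical NTK Gram matrix JJᵀ. The sketch below checks this numerically for a toy one-hidden-layer tanh network. The network, the sizes, and the function name `jacobian` are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's asymptotic results.

```python
import numpy as np

def jacobian(a, W, X):
    """Jacobian of f(x) = a . tanh(W x) / sqrt(m) w.r.t. (a, W); one row per input."""
    m = a.shape[0]
    act = np.tanh(X @ W.T)                      # (N, m) hidden activations
    d_a = act / np.sqrt(m)                      # df/da[j] = tanh(w_j . x) / sqrt(m)
    # df/dW[j, k] = a[j] * (1 - tanh^2(w_j . x)) * x[k] / sqrt(m)
    d_W = (a * (1.0 - act**2))[:, :, None] * X[:, None, :] / np.sqrt(m)  # (N, m, d)
    return np.concatenate([d_a, d_W.reshape(X.shape[0], -1)], axis=1)

# Hypothetical toy sizes: N inputs of dimension d, hidden width m.
rng = np.random.default_rng(0)
N, d, m = 5, 3, 16
X = rng.normal(size=(N, d))
a = rng.normal(size=m)
W = rng.normal(size=(m, d))

J = jacobian(a, W, X)                # (N, P) with P = m + m * d parameters
ntk_gram = J @ J.T / N               # (N, N) empirical NTK Gram matrix, scaled by 1/N
gauss_newton = J.T @ J / N           # (P, P) Gauss-Newton part of the Hessian

# Sort both spectra in decreasing order for comparison.
ntk_eigs = np.sort(np.linalg.eigvalsh(ntk_gram))[::-1]
gn_eigs = np.sort(np.linalg.eigvalsh(gauss_newton))[::-1]

# The top N eigenvalues agree; the remaining P - N are numerically zero.
print(np.allclose(gn_eigs[:N], ntk_eigs))
print(np.allclose(gn_eigs[N:], 0.0, atol=1e-8))
```

The P x P Gauss-Newton matrix has rank at most N, so its spectrum is that of the much smaller N x N Gram matrix padded with zeros. The residual-weighted term of the Hessian is not computed here.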

