LGOct 15, 2020

On the exact computation of linear frequency principle dynamics and its generalization

arXiv:2010.08153v10.0026 citations
AI Analysis85

This provides theoretical insight into why deep neural networks generalize well for low-frequency functions, addressing a fundamental problem in machine learning theory.

The authors derived an exact linear frequency-principle (LFP) model to describe how neural networks evolve from low to high frequencies during training, showing that higher frequencies evolve slower depending on activation smoothness, and proved a generalization error bound controlled by a frequency-principle norm.

Recent works show an intriguing phenomenon of Frequency Principle (F-Principle) that deep neural networks (DNNs) fit the target function from low to high frequency during the training, which provides insight into the training and generalization behavior of DNNs in complex tasks. In this paper, through analysis of an infinite-width two-layer NN in the neural tangent kernel (NTK) regime, we derive the exact differential equation, namely Linear Frequency-Principle (LFP) model, governing the evolution of NN output function in the frequency domain during the training. Our exact computation applies for general activation functions with no assumption on size and distribution of training data. This LFP model unravels that higher frequencies evolve polynomially or exponentially slower than lower frequencies depending on the smoothness/regularity of the activation function. We further bridge the gap between training dynamics and generalization by proving that LFP model implicitly minimizes a Frequency-Principle norm (FP-norm) of the learned function, by which higher frequencies are more severely penalized depending on the inverse of their evolution rate. Finally, we derive an \textit{a priori} generalization error bound controlled by the FP-norm of the target function, which provides a theoretical justification for the empirical results that DNNs often generalize well for low frequency functions.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes