LG, AI, ML | May 23, 2024

Understanding the dynamics of the frequency bias in neural networks

arXiv:2405.14957v1 | 5 citations | h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses frequency bias in neural network training. It is rated an incremental advance because it builds on prior observations of the bias, adding theoretical insight and practical initialization strategies.

The study derives a PDE that models the error dynamics of a 2-layer network in the Neural Tangent Kernel regime, and uses it to show that the choice of weight initialization distribution can control or eliminate the frequency bias. The results are validated experimentally on Fourier Features models and extended empirically to multi-layer networks.

Recent works have shown that traditional Neural Network (NN) architectures display a marked frequency bias in the learning process. Namely, the NN first learns the low-frequency features before learning the high-frequency ones. In this study, we rigorously develop a partial differential equation (PDE) that unravels the frequency dynamics of the error for a 2-layer NN in the Neural Tangent Kernel regime. Furthermore, using this insight, we explicitly demonstrate how an appropriate choice of distributions for the initialization weights can eliminate or control the frequency bias. We focus our study on the Fourier Features model, an NN where the first layer has sine and cosine activation functions, with frequencies sampled from a prescribed distribution. In this setup, we experimentally validate our theoretical results and compare the NN dynamics to the solution of the PDE using the finite element method. Finally, we empirically show that the same principle extends to multi-layer NNs.
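The effect described in the abstract can be illustrated with a minimal sketch. This is not the paper's PDE or its experimental setup: it assumes a Gaussian frequency distribution, a 1-D target with one low and one high frequency, and it trains only the output weights of a Fourier Features layer (a linearized stand-in for the Neural Tangent Kernel regime). All names and parameters (`fourier_features`, `residual_after_training`, `mode_error`, `freq_std`, the width `m`) are hypothetical.

```python
import numpy as np

def fourier_features(x, freqs):
    """Map inputs x of shape (n,) to [cos(w x), sin(w x)] features, shape (n, 2m)."""
    phase = np.outer(x, freqs)
    feats = np.concatenate([np.cos(phase), np.sin(phase)], axis=1)
    return feats / np.sqrt(len(freqs))  # scaling keeps the kernel diagonal at 1

def residual_after_training(freq_std, steps=500, m=512, n=128, seed=0):
    """Train only the output weights by gradient descent; return (x, residual)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    y = np.sin(x) + np.sin(20.0 * x)           # one low- and one high-frequency mode
    freqs = rng.normal(0.0, freq_std, size=m)  # assumed Gaussian initialization
    phi = fourier_features(x, freqs)
    a = np.zeros(2 * m)                        # output weights start at zero
    for _ in range(steps):
        residual = phi @ a - y
        a -= phi.T @ residual / n              # step size 1 on 0.5 * mean(residual**2)
    return x, phi @ a - y

def mode_error(x, residual, k):
    """Amplitude of the residual along sin(k x) on the uniform grid."""
    return abs(2.0 * np.mean(residual * np.sin(k * x)))

if __name__ == "__main__":
    for freq_std in (1.0, 20.0):
        x, r = residual_after_training(freq_std)
        print(f"freq_std={freq_std}: low-mode error={mode_error(x, r, 1):.3f}, "
              f"high-mode error={mode_error(x, r, 20):.3f}")
```

In this toy setting, a narrow frequency distribution (`freq_std=1.0`) fits the low-frequency mode while leaving the high-frequency mode largely unlearned, whereas a wide one (`freq_std=20.0`) reduces the error on both. The convergence rate of each mode is governed by how much probability mass the initialization places near that frequency, which is the lever the abstract refers to.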
