LG, OC, ML | Jun 21, 2019

Theory of the Frequency Principle for General Deep Neural Networks

arXiv:1906.09235v2 | 101 citations
Originality: Incremental advance
AI Analysis

This work offers a theoretical foundation for understanding DNN training dynamics, which is incremental as it builds on prior empirical studies of the F-Principle.

The paper rigorously investigates the Frequency Principle (F-Principle) in deep neural networks, providing theorems for three training stages to explain how DNNs learn from low to high frequencies, with results applicable to general network architectures, activation functions, and loss functions.

Alongside the fruitful application of Deep Neural Networks (DNNs) to realistic problems, recent empirical studies of DNNs have reported a universal phenomenon, the Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during training. The F-Principle has been very useful in providing both qualitative and quantitative understanding of DNNs. In this paper, we rigorously investigate the F-Principle for the training dynamics of a general DNN at three stages: the initial stage, the intermediate stage, and the final stage. For each stage, a theorem is provided in terms of proper quantities characterizing the F-Principle. Our results are general in the sense that they hold for multilayer networks with general activation functions, population densities of data, and a large class of loss functions. Our work lays a theoretical foundation for the F-Principle and for a better understanding of the training process of DNNs.
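One way to make "learning from low to high frequencies" concrete is to track the relative error of a model's output at individual Fourier frequencies. The sketch below is illustrative only and is not taken from the paper: the helper name `relative_freq_error`, the 1-D target, and the hand-built "mid-training" snapshot are all assumptions, and no network is actually trained. It shows only how such a per-frequency error could be measured.

```python
import numpy as np

def relative_freq_error(target, pred, freq_indices):
    """Relative DFT error |F[pred](k) - F[target](k)| / |F[target](k)|
    at the given frequency indices k (hypothetical helper, not from the paper)."""
    t_hat = np.fft.rfft(target)
    p_hat = np.fft.rfft(pred)
    k = np.asarray(freq_indices)
    return np.abs(p_hat[k] - t_hat[k]) / np.abs(t_hat[k])

# Assumed toy target: one low-frequency and one high-frequency component.
n = 256
x = np.arange(n) / n                      # uniform grid on [0, 1)
low = np.sin(2 * np.pi * 1 * x)           # frequency index 1
high = 0.5 * np.sin(2 * np.pi * 10 * x)   # frequency index 10
target = low + high

# Hand-built stand-in for a model snapshot that has captured only the
# low-frequency part, as the F-Principle would suggest early in training.
pred_mid = low.copy()

errs_mid = relative_freq_error(target, pred_mid, [1, 10])
errs_final = relative_freq_error(target, target, [1, 10])

print("mid-training errors (k=1, k=10):", errs_mid)
print("perfect-fit errors  (k=1, k=10):", errs_final)
```

In an actual experiment, `pred_mid` would be replaced by the network's output recorded at successive training steps. Under the F-Principle, the error at the low frequency would be expected to fall before the error at the high frequency.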

Code Implementations: 1 repo