LG OC ML Jun 21, 2019

Theory of the Frequency Principle for General Deep Neural Networks

Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

arXiv:1906.09235v220.5101 citations

Originality: Incremental advance

AI Analysis

This work offers a theoretical foundation for understanding DNN training dynamics. The advance is incremental in that it builds on prior empirical studies of the F-Principle.

The paper rigorously investigates the Frequency Principle (F-Principle) in deep neural networks, providing theorems for three training stages to explain how DNNs learn from low to high frequencies, with results applicable to general network architectures, activation functions, and loss functions.

Alongside the fruitful application of Deep Neural Networks (DNNs) to realistic problems, recent empirical studies of DNNs have reported a universal phenomenon, the Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during training. The F-Principle has been very useful in providing both qualitative and quantitative understanding of DNNs. In this paper, we rigorously investigate the F-Principle for the training dynamics of a general DNN at three stages: the initial stage, the intermediate stage, and the final stage. For each stage, a theorem is provided in terms of proper quantities characterizing the F-Principle. Our results are general in the sense that they hold for multilayer networks with general activation functions, population densities of data, and a large class of loss functions. Our work lays a theoretical foundation for the F-Principle, toward a better understanding of the training process of DNNs.
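The low-to-high-frequency behaviour described in the abstract can be probed numerically. The sketch below is only an illustration and is not taken from the paper: it assumes a one-hidden-layer tanh network trained by full-batch gradient descent on a 1-D target with two frequency components, and it tracks the relative error of the corresponding discrete Fourier coefficients. The names `relative_freq_error` and `train_and_track`, the target function, and all hyperparameters are hypothetical choices made for this example.

```python
import numpy as np

def relative_freq_error(target, pred, ks):
    """Relative error of the DFT coefficients of `pred` vs `target` at indices `ks`."""
    t_hat = np.fft.rfft(target)
    p_hat = np.fft.rfft(pred)
    return np.abs(p_hat[ks] - t_hat[ks]) / np.abs(t_hat[ks])

def train_and_track(steps=2000, width=64, lr=0.005, seed=0, every=200):
    """Train a tiny tanh network with full-batch gradient descent on MSE and
    record the relative error at DFT indices 1 and 5 every `every` steps."""
    rng = np.random.default_rng(seed)
    n = 128
    # One full period on [-pi, pi), so sin(k x) sits exactly at DFT index k.
    x = np.linspace(-np.pi, np.pi, n, endpoint=False).reshape(-1, 1)
    y = np.sin(x) + 0.5 * np.sin(5 * x)  # assumed target: frequencies 1 and 5
    ks = np.array([1, 5])

    # Assumed initialization; not prescribed by the source.
    W1 = rng.normal(0.0, 1.0, (1, width))
    b1 = rng.normal(0.0, 1.0, width)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(width), (width, 1))
    b2 = np.zeros(1)

    history = []
    for step in range(steps + 1):
        h = np.tanh(x @ W1 + b1)   # hidden activations, shape (n, width)
        out = h @ W2 + b2          # network output, shape (n, 1)
        if step % every == 0:
            history.append(relative_freq_error(y[:, 0], out[:, 0], ks))

        # Backpropagation for the mean squared error.
        g_out = 2.0 * (out - y) / n
        gW2 = h.T @ g_out
        gb2 = g_out.sum(axis=0)
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)
        gW1 = x.T @ g_h
        gb1 = g_h.sum(axis=0)

        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return np.array(history)

if __name__ == "__main__":
    hist = train_and_track()
    for i, (low, high) in enumerate(hist):
        print(f"checkpoint {i}: rel. error k=1: {low:.3f}, k=5: {high:.3f}")
```

If the F-Principle holds in this toy setting, the error at index 1 would be expected to decrease earlier than the error at index 5, although a single small run like this is not a substitute for the theorems the paper states.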
