Interplay between depth of neural networks and locality of target functions
This work addresses a theoretical gap in explaining why overparameterized deep neural networks generalize well, a question of interest to researchers in machine learning theory.
The paper investigates how the effect of network depth on learning depends on the locality of the target function. It finds that depth helps in learning local functions but hinders learning global functions, an effect that the neural tangent kernel does not capture.
It has been recognized that heavily overparameterized deep neural networks (DNNs) exhibit surprisingly good generalization performance in various machine-learning tasks. Although the benefits of depth have been investigated from different perspectives, such as approximation theory and statistical learning theory, existing theories do not adequately explain the empirical success of overparameterized DNNs. In this work, we report a remarkable interplay between depth and the locality of a target function. We introduce $k$-local and $k$-global functions, and find that depth is beneficial for learning local functions but detrimental to learning global functions. This interplay is not properly captured by the neural tangent kernel, which describes an infinitely wide neural network within the lazy learning regime.
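The abstract names $k$-local and $k$-global functions without defining them, so the following is only an illustrative sketch under stated assumptions, not necessarily the construction used in the paper. It assumes that a $k$-local label is the sign of a product of $k$ consecutive input components at one fixed position, and that a $k$-global label is the sign of the sum of such products over all positions, with periodic wrap-around. The function names `k_local_label` and `k_global_label`, the `start` parameter, and the Gaussian inputs are hypothetical choices made for this example.

```python
import numpy as np


def k_local_label(x, k, start=0):
    """Assumed k-local label: sign of the product of k consecutive
    components beginning at index `start` (indices wrap periodically)."""
    idx = (start + np.arange(k)) % x.shape[-1]
    return np.sign(np.prod(x[..., idx], axis=-1))


def k_global_label(x, k):
    """Assumed k-global label: sign of the sum, over every position j,
    of the product of the k consecutive components starting at j."""
    d = x.shape[-1]
    total = np.zeros(x.shape[:-1])
    for j in range(d):
        idx = (j + np.arange(k)) % d
        total += np.prod(x[..., idx], axis=-1)
    return np.sign(total)


# Hypothetical data: 5 inputs of dimension 8 with i.i.d. Gaussian entries.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
y_local = k_local_label(X, k=2)    # depends only on X[:, 0] and X[:, 1]
y_global = k_global_label(X, k=2)  # depends on every component of X
```

Under these assumptions, the local label ignores all but $k$ components of the input, while the global label changes with any component, which is the distinction the abstract draws between the two function classes.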