Lower Bounds on the Generalization Error of Nonlinear Learning Models
This work contributes to the theoretical understanding of generalization in deep learning, offering researchers insight into error bounds in high-dimensional regimes. The contribution is incremental in that it builds on existing theory.
The paper derives lower bounds on the generalization error of nonlinear learning models, specifically multi-layer neural networks, in the regime where layer size is comparable to sample size. It shows that unbiased estimators perform poorly in this regime and provides explicit bounds for biased estimators in linear regression and in two-layered networks. The linear bound is asymptotically tight.
In this paper we study lower bounds on the generalization error of models derived from multi-layer neural networks, in the regime where the size of the layers is commensurate with the number of samples in the training data. We show that unbiased estimators have unacceptable performance for such nonlinear networks in this regime. We derive explicit generalization lower bounds for general biased estimators, in the cases of linear regression and of two-layered networks. In the linear case the bound is asymptotically tight. In the nonlinear case, we compare our bounds with an empirical study of the stochastic gradient descent algorithm. The analysis uses elements from the theory of large random matrices.
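To give a feel for the regime where model size is commensurate with sample size, here is a minimal sketch of the classical linear case only. It is not the paper's bound or its experimental setup. It assumes isotropic Gaussian features and Gaussian noise, for which the unbiased ordinary least-squares estimator has expected excess risk sigma^2 * d / (n - d - 1) when n > d + 1. The function names `ols_excess_risk` and `ols_excess_risk_theory` and all parameter values are hypothetical, chosen for illustration.

```python
import numpy as np


def ols_excess_risk(n, d, sigma=1.0, trials=200, seed=0):
    """Monte Carlo estimate of E||w_hat - w||^2 for ordinary least squares.

    Assumption: features are i.i.d. N(0, I_d), so the excess prediction
    risk on a fresh sample equals the squared parameter error.
    """
    rng = np.random.default_rng(seed)
    risks = []
    for _ in range(trials):
        w = rng.standard_normal(d) / np.sqrt(d)       # ground-truth weights
        X = rng.standard_normal((n, d))               # n samples, d features
        y = X @ w + sigma * rng.standard_normal(n)    # noisy linear responses
        w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # unbiased OLS estimate
        risks.append(np.sum((w_hat - w) ** 2))
    return float(np.mean(risks))


def ols_excess_risk_theory(n, d, sigma=1.0):
    """Closed form sigma^2 * d / (n - d - 1), valid for Gaussian design, n > d + 1."""
    return sigma ** 2 * d / (n - d - 1)


if __name__ == "__main__":
    n = 100
    for d in (10, 50, 80, 90):
        mc = ols_excess_risk(n, d)
        th = ols_excess_risk_theory(n, d)
        print(f"d/n = {d / n:.2f}  simulated = {mc:.3f}  closed form = {th:.3f}")
```

Under these assumptions, as d/n tends to a ratio gamma < 1 the excess risk approaches sigma^2 * gamma / (1 - gamma), which does not vanish with more data and grows without bound as gamma approaches 1. This is the linear analogue of the poor performance of unbiased estimators described above.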