Fundamental limits of overparametrized shallow neural networks for supervised learning

arXiv:2307.05635v111 citationsh-index: 24
Originality Highly original
AI Analysis

This provides rigorous theoretical limits for supervised learning with neural networks, addressing a foundational problem in machine learning theory.

The authors derived information-theoretic bounds on the mutual information and Bayes-optimal generalization error for overparametrized two-layer neural networks trained from data generated by a teacher network, establishing fundamental performance limits based on sample size, input dimension, and hidden units.

We carry out an information-theoretical analysis of a two-layer neural network trained from input-output pairs generated by a teacher network with matching architecture, in overparametrized regimes. Our results come in the form of bounds relating i) the mutual information between training data and network weights, or ii) the Bayes-optimal generalization error, to the same quantities but for a simpler (generalized) linear model for which explicit expressions are rigorously known. Our bounds, which are expressed in terms of the number of training samples, input dimension and number of hidden units, thus yield fundamental performance limits for any neural network (and actually any learning procedure) trained from limited data generated according to our two-layer teacher neural network model. The proof relies on rigorous tools from spin glasses and is guided by ``Gaussian equivalence principles'' lying at the core of numerous recent analyses of neural networks. With respect to the existing literature, which is either non-rigorous or restricted to the case of the learning of the readout weights only, our results are information-theoretic (i.e. are not specific to any learning algorithm) and, importantly, cover a setting where all the network parameters are trained.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes