The Restricted Isometry of ReLU Networks: Generalization through Norm Concentration
This work addresses generalization guarantees for neural networks, a foundational problem in machine learning. Its contribution appears incremental, however, because it builds on existing concentration and chaining techniques.
The paper tackles generalization in shallow ReLU networks by introducing the Neural Restricted Isometry Property (NeuRIP), a uniform concentration event under which all such networks are sketched with the same quality from limited training data. It derives sample complexity bounds for achieving NeuRIP and shows that networks with sufficiently small empirical risk generalize uniformly.
While regression tasks aim to interpolate a relation on the entire input space, they often have to be solved with a limited amount of training data. Still, if the hypothesis functions can be sketched well with the data, one can hope to identify a model that generalizes. In this work, we introduce the Neural Restricted Isometry Property (NeuRIP), a uniform concentration event in which all shallow $\mathrm{ReLU}$ networks are sketched with the same quality. To derive the sample complexity for achieving NeuRIP, we bound the covering numbers of the networks in the sub-Gaussian metric and apply chaining techniques. Under the NeuRIP event, we then provide bounds on the expected risk that hold for networks in any sublevel set of the empirical risk. We conclude that all networks with sufficiently small empirical risk generalize uniformly.
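To make "sketched with the same quality" concrete, the following is a minimal numerical sketch, not the paper's method. It assumes one plausible reading of the NeuRIP event: for every network $f$ in the hypothesis class, the empirical squared norm on the training sample stays within a common relative tolerance of the squared norm under the data distribution. All function names, the Gaussian input distribution, the network parametrization (weights, biases, output signs), and the chosen sizes are illustrative assumptions. Checking finitely many random networks can only suggest the uniform behaviour; it does not certify it.

```python
import math
import random


def shallow_relu(x, weights, biases, signs):
    """Evaluate f(x) = sum_i signs[i] * max(0, <weights[i], x> + biases[i])."""
    total = 0.0
    for w, b, s in zip(weights, biases, signs):
        pre = sum(wj * xj for wj, xj in zip(w, x)) + b
        total += s * max(0.0, pre)
    return total


def empirical_sq_norm(net, samples):
    """Mean of f(x)^2 over the samples (the 'sketch' of the squared norm of f)."""
    return sum(shallow_relu(x, *net) ** 2 for x in samples) / len(samples)


def random_net(rng, dim, width):
    """Draw a hypothetical shallow ReLU network (illustrative initialization)."""
    weights = [[rng.gauss(0, 1) / math.sqrt(dim) for _ in range(dim)]
               for _ in range(width)]
    biases = [rng.gauss(0, 0.1) for _ in range(width)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(width)]
    return weights, biases, signs


def worst_ratio_deviation(nets, train, reference):
    """Largest |empirical / reference - 1| over the given networks.

    `reference` is a large sample standing in for the true distribution,
    which is itself only a Monte Carlo approximation.
    """
    worst = 0.0
    for net in nets:
        ref = empirical_sq_norm(net, reference)
        if ref == 0.0:
            continue  # skip networks that vanish on the reference sample
        emp = empirical_sq_norm(net, train)
        worst = max(worst, abs(emp / ref - 1.0))
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    dim, width = 3, 4
    nets = [random_net(rng, dim, width) for _ in range(20)]
    reference = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5000)]
    for m in (20, 200, 2000):
        train = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(m)]
        dev = worst_ratio_deviation(nets, train, reference)
        print(f"m={m:5d}  worst relative deviation over nets: {dev:.3f}")
```

Under this reading, a small worst-case deviation across all tested networks is what a NeuRIP-type event would guarantee for the whole class at once; the sample complexity bounds in the paper describe how large the training sample must be for that to hold with high probability.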