On Size-Independent Sample Complexity of ReLU Networks
This addresses theoretical generalization guarantees for neural networks, but it is incremental as it builds on previous bounds.
The paper tackles the problem of sample complexity for learning ReLU neural networks by refining existing bounds to often eliminate explicit depth-dependence, improving upon prior work that included a square-root depth factor.
We study the sample complexity of learning ReLU neural networks from the point of view of generalization. Given norm constraints on the weight matrices, a common approach is to estimate the Rademacher complexity of the associated function class. Previously Golowich-Rakhlin-Shamir (2020) obtained a bound independent of the network size (scaling with a product of Frobenius norms) except for a factor of the square-root depth. We give a refinement which often has no explicit depth-dependence at all.