A path-norm toolkit for modern networks: consequences, promises and challenges
This work gives machine learning researchers a versatile toolkit for computing and applying path-norm-based generalization bounds to complex neural network architectures, although it is an incremental extension of existing path-norm methods.
The authors tackle the problem of establishing generalization bounds for modern neural networks by introducing a comprehensive path-norm toolkit that applies to DAG ReLU networks with biases, skip connections, and operations such as max pooling, and that recovers or beats the sharpest known bounds of this type.
This work introduces the first toolkit around path-norms that fully encompasses general DAG ReLU networks with biases, skip connections and any operation based on the extraction of order statistics: max pooling, GroupSort, etc. This toolkit notably allows us to establish generalization bounds for modern neural networks that are not only the most widely applicable path-norm-based ones, but also recover or beat the sharpest known bounds of this type. These extended path-norms further enjoy the usual benefits of path-norms: ease of computation, invariance under the symmetries of the network, and improved sharpness on layered fully-connected networks compared to the product of operator norms, another commonly used complexity measure. The versatility of the toolkit and its ease of implementation allow us to challenge the concrete promises of path-norm-based generalization bounds, by numerically evaluating the sharpest known bounds for ResNets on ImageNet.
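The three benefits listed in the abstract (ease of computation, invariance under network symmetries, and sharpness relative to the product of operator norms) can be illustrated on a toy case. The sketch below is not the paper's toolkit: it assumes a bias-free, layered, fully-connected ReLU network with a scalar output, and all function names are hypothetical. It also assumes the max absolute row sum as the operator norm, chosen because it upper-bounds the L1 path-norm in this setting; the norms compared in the paper may differ.

```python
import math
import random


def l1_path_norm(weights):
    """L1 path-norm of a bias-free layered network (assumed setting).

    weights: list of matrices W_1..W_L as nested lists, each of shape (out, in).
    The sum over all input-to-output paths of the product of absolute weights
    equals a single forward pass of the all-ones vector through |W_l|.
    """
    v = [1.0] * len(weights[0][0])
    for W in weights:
        v = [sum(abs(w) * x for w, x in zip(row, v)) for row in W]
    return sum(v)


def product_of_operator_norms(weights):
    """Product over layers of the max absolute row sum (assumed choice of norm)."""
    return math.prod(max(sum(abs(w) for w in row) for row in W) for W in weights)


def rescale_hidden_neuron(weights, layer, neuron, lam):
    """Multiply a hidden neuron's incoming weights by lam > 0 and divide its
    outgoing weights by lam. For ReLU this leaves the network function unchanged."""
    out = [[row[:] for row in W] for W in weights]
    out[layer][neuron] = [w * lam for w in out[layer][neuron]]
    for row in out[layer + 1]:
        row[neuron] /= lam
    return out


def random_matrix(rng, n_out, n_in):
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


rng = random.Random(0)
# Toy network: 5 inputs -> 8 hidden -> 6 hidden -> 1 output.
weights = [random_matrix(rng, 8, 5), random_matrix(rng, 6, 8), random_matrix(rng, 1, 6)]
rescaled = rescale_hidden_neuron(weights, layer=0, neuron=3, lam=10.0)

pn = l1_path_norm(weights)
pn_rescaled = l1_path_norm(rescaled)
prod = product_of_operator_norms(weights)
prod_rescaled = product_of_operator_norms(rescaled)

print("path-norm:", pn, "after rescaling:", pn_rescaled)
print("product of operator norms:", prod, "after rescaling:", prod_rescaled)
```

In this toy setting the path-norm is unchanged by the rescaling and never exceeds the product of row-sum norms, whereas the product of norms generally does change under the same rescaling. Extending the computation to biases, skip connections, and pooling is what the toolkit itself addresses and is not attempted here.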