On Rademacher Complexity-based Generalization Bounds for Deep Learning
This work provides improved theoretical generalization guarantees for deep learning, particularly for CNNs with general Lipschitz activation functions. The contribution is incremental in that it builds on existing Rademacher complexity methods.
The paper tackles the problem of establishing non-vacuous generalization bounds for Convolutional Neural Networks (CNNs) within a Rademacher complexity-based framework, and it improves on prior results for ReLU-based Deep Neural Networks.
We show that the Rademacher complexity-based framework can establish non-vacuous generalization bounds for Convolutional Neural Networks (CNNs) in the context of classifying a small set of image classes. A key technical advancement is the formulation of novel contraction lemmas for high-dimensional mappings between vector spaces, specifically designed for general Lipschitz activation functions. These lemmas extend and refine the Talagrand contraction lemma across a broader range of scenarios. Our Rademacher complexity bound provides an enhancement over the results presented by Golowich et al. for ReLU-based Deep Neural Networks (DNNs). Moreover, while previous works utilizing Rademacher complexity have primarily focused on ReLU DNNs, our results generalize to a wider class of activation functions.
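For readers unfamiliar with the classical result that these lemmas extend, the scalar Talagrand contraction lemma says that composing a function class with an L-Lipschitz function scales its empirical Rademacher complexity by at most L. The sketch below is only an illustration of that classical scalar statement, not of the paper's new high-dimensional lemmas. It uses a small, randomly generated finite function class, and every name in it (`empirical_rademacher`, `make_class`, `compose`) is hypothetical. The complexity is computed exactly by enumerating all sign vectors, which is feasible only because the sample size is assumed to be tiny (n = 8).

```python
import itertools
import math
import random


def empirical_rademacher(outputs):
    """Exact empirical Rademacher complexity of a finite function class.

    outputs[k][i] holds f_k(x_i). The expectation over the Rademacher
    signs is computed by enumerating all 2^n sign vectors, so n must be small.
    """
    n = len(outputs[0])
    total = 0.0
    for sigma in itertools.product((-1.0, 1.0), repeat=n):
        # Supremum over the class of the sign-weighted sum of outputs.
        total += max(sum(s * v for s, v in zip(sigma, row)) for row in outputs)
    return total / (2 ** n) / n


def make_class(num_functions=20, n=8, seed=0):
    """Toy class (an assumption for illustration): random outputs on n points."""
    rng = random.Random(seed)
    return [[rng.uniform(-2.0, 2.0) for _ in range(n)] for _ in range(num_functions)]


def compose(phi, outputs):
    """Apply a scalar function phi to every output of every function."""
    return [[phi(v) for v in row] for row in outputs]


def relu(z):
    return max(0.0, z)


F = make_class()
r_f = empirical_rademacher(F)
r_relu = empirical_rademacher(compose(relu, F))       # ReLU is 1-Lipschitz
r_tanh = empirical_rademacher(compose(math.tanh, F))  # tanh is 1-Lipschitz

print(f"R(F)        = {r_f:.4f}")
print(f"R(relu o F) = {r_relu:.4f}")
print(f"R(tanh o F) = {r_tanh:.4f}")
```

Both ReLU and tanh are 1-Lipschitz, so the two composed values should not exceed the value for the original class. That the same check applies to tanh as to ReLU is the scalar analogue of the abstract's point that the analysis is not tied to ReLU.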