LG | ML | May 30, 2018

Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks

arXiv:1805.12076v1 | 401 citations
Originality: Incremental advance
AI Analysis

This work addresses a foundational theoretical gap in understanding generalization in neural networks. The advance is rated incremental because it builds on existing complexity measures.

The authors seek to explain why over-parametrized neural networks generalize better. They propose a novel complexity measure based on unit-wise capacities, which yields a tighter generalization bound for two-layer ReLU networks and correlates with test error behavior.

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities resulting in a tighter generalization bound for two layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes, and could potentially explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.
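
To make "unit-wise" concrete, here is a minimal sketch of per-hidden-unit quantities for a two-layer ReLU network. It assumes, purely for illustration, that each hidden unit is summarized by the distance of its incoming weights from initialization and by the norm of its outgoing weights; the abstract above does not state these definitions. The names `U`, `U0`, `V`, `two_layer_relu`, and `unit_wise_norms` are hypothetical.

```python
import numpy as np

def two_layer_relu(U, V, x):
    """Forward pass of a two-layer ReLU network: V @ relu(U @ x)."""
    return V @ np.maximum(U @ x, 0.0)

def unit_wise_norms(U, U0, V):
    """Per-hidden-unit summaries (illustrative assumption, not the paper's stated formula).

    U, U0: (h, d) hidden-layer weights after training and at initialization.
    V:     (c, h) output-layer weights.
    Returns two arrays of shape (h,).
    """
    # Row i of U holds the incoming weights of hidden unit i.
    dist_from_init = np.linalg.norm(U - U0, axis=1)
    # Column i of V holds the outgoing weights of hidden unit i.
    outgoing_norm = np.linalg.norm(V, axis=0)
    return dist_from_init, outgoing_norm

# Tiny example with h = 2 hidden units, d = 3 inputs, c = 2 outputs.
U0 = np.zeros((2, 3))
U = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]])
V = np.array([[1.0, 0.0], [0.0, 2.0]])
dist_from_init, outgoing_norm = unit_wise_norms(U, U0, V)
```

Summarizing each unit separately, rather than taking a single norm over the whole layer, is what lets such a measure distinguish units that moved far from initialization from units that barely changed.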

Code Implementations: 2 repos
