LG, Feb 24

On the Generalization Behavior of Deep Residual Networks From a Dynamical System Perspective

arXiv:2602.20921v11 citations
h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses a theoretical gap in generalization analysis for ResNets by offering a unified framework that covers both the discrete- and continuous-time settings. It is mainly of interest to researchers in machine learning theory.

The paper studies generalization in deep residual networks and establishes generalization error bounds for both discrete- and continuous-time ResNets. The bounds are of order O(1/√S) in the number of training samples S and include a structure-dependent negative term, which yields depth-uniform and asymptotic bounds under milder assumptions.

Deep neural networks (DNNs) have significantly advanced machine learning, with model depth playing a central role in their successes. The dynamical system modeling approach has recently emerged as a powerful framework, offering new mathematical insights into the structure and learning behavior of DNNs. In this work, we establish generalization error bounds for both discrete- and continuous-time residual networks (ResNets) by combining Rademacher complexity, flow maps of dynamical systems, and the convergence behavior of ResNets in the deep-layer limit. The resulting bounds are of order $O(1/\sqrt{S})$ with respect to the number of training samples $S$, and include a structure-dependent negative term, yielding depth-uniform and asymptotic generalization bounds under milder assumptions. These findings provide a unified understanding of generalization across both discrete- and continuous-time ResNets, helping to close the gap in both the order of sample complexity and assumptions between the discrete- and continuous-time settings.
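
To make the "deep-layer limit" concrete, here is a minimal sketch of the dynamical-system view, not the paper's construction. It assumes a weight-shared residual block of the form x + (1/L) tanh(Wx + b), which is a forward-Euler step for dx/dt = tanh(Wx + b) on the time interval [0, 1]. The weights W, the bias b, the input x0, and the function names are all hypothetical and chosen only for illustration. The sketch does not compute any generalization bound; it only shows the network output stabilizing as the depth L grows.

```python
import math


def residual_block(x, weights, bias, step):
    """One residual update: x + step * tanh(W x + b)."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    return [xi + step * math.tanh(p) for xi, p in zip(x, pre)]


def resnet_flow(x0, weights, bias, depth, horizon=1.0):
    """Apply `depth` weight-shared residual blocks with step horizon / depth.

    This is forward Euler for dx/dt = tanh(W x + b) on [0, horizon].
    """
    x = list(x0)
    step = horizon / depth
    for _ in range(depth):
        x = residual_block(x, weights, bias, step)
    return x


# Hypothetical parameters and input (assumptions, not taken from the paper).
W = [[0.5, -1.0], [1.0, 0.3]]
b = [0.1, -0.2]
x0 = [1.0, -0.5]

# A very deep network serves as a stand-in for the continuous-time flow map.
reference = resnet_flow(x0, W, b, depth=4096)

errors = {}
for depth in (4, 16, 64, 256):
    out = resnet_flow(x0, W, b, depth)
    # Max-norm distance between the depth-L output and the reference.
    errors[depth] = max(abs(a - r) for a, r in zip(out, reference))
    print(depth, errors[depth])
```

In this toy setting the distance to the very deep reference should shrink as the depth increases. That is the kind of convergence behavior the abstract combines with Rademacher complexity and flow maps to obtain depth-uniform bounds.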

