ML, LG, May 11, 2023

Generalization bounds for neural ordinary differential equations and deep residual networks

arXiv:2305.06648v2, 29 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical insights into generalization for continuous-depth models. The advance is rated incremental because it builds on existing frameworks.

The authors derive a generalization bound for neural ordinary differential equations and deep residual networks using a Lipschitz-based argument. The bound depends on the magnitude of the differences between successive weight matrices, and the authors illustrate numerically how this quantity affects generalization.

Neural ordinary differential equations (neural ODEs) are a popular family of continuous-depth deep learning models. In this work, we consider a large family of parameterized ODEs with continuous-in-time parameters, which include time-dependent neural ODEs. We derive a generalization bound for this class by a Lipschitz-based argument. By leveraging the analogy between neural ODEs and deep residual networks, our approach yields in particular a generalization bound for a class of deep residual networks. The bound involves the magnitude of the difference between successive weight matrices. We illustrate numerically how this quantity affects the generalization capability of neural networks.
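
To make the quantity in the abstract concrete, the following minimal sketch computes the differences between successive weight matrices for a toy residual network. Everything specific here is an assumption made for illustration and is not taken from the paper: the tanh residual block, the 1/L step scaling (an explicit Euler discretization of an ODE on [0, 1]), the Frobenius norm, the rescaling of the largest difference by the depth L, and the names `successive_weight_differences` and `residual_forward`.

```python
import numpy as np


def successive_weight_differences(weights):
    """Frobenius norms ||W_{k+1} - W_k||_F for consecutive layers.

    `weights` has shape (L, d, d): one square matrix per layer.
    The choice of norm is an assumption made for this sketch.
    """
    weights = np.asarray(weights, dtype=float)
    diffs = weights[1:] - weights[:-1]
    return np.linalg.norm(diffs, axis=(1, 2))


def residual_forward(x, weights):
    """Toy residual network h_{k+1} = h_k + (1/L) * tanh(W_k h_k).

    With the 1/L factor, this is an explicit Euler step of the ODE
    dh/dt = tanh(W(t) h) on [0, 1], where W_k plays the role of W(k/L).
    """
    depth = len(weights)
    h = np.asarray(x, dtype=float)
    for W in weights:
        h = h + np.tanh(W @ h) / depth
    return h


rng = np.random.default_rng(0)
depth, dim = 50, 4

# "Smooth" weights: layer k samples a straight line between two matrices,
# so consecutive layers differ by an amount of order 1/L.
t = np.linspace(0.0, 1.0, depth)[:, None, None]
A, B = rng.standard_normal((2, dim, dim))
smooth = (1.0 - t) * A + t * B

# "Rough" weights: independent draws per layer, no relation between layers.
rough = rng.standard_normal((depth, dim, dim))

# Depth-rescaled largest difference, a discrete analogue of a Lipschitz
# constant of the map t -> W(t) (hypothetical normalization).
smooth_gap = depth * successive_weight_differences(smooth).max()
rough_gap = depth * successive_weight_differences(rough).max()

print("smooth:", smooth_gap)
print("rough: ", rough_gap)
print("output:", residual_forward(np.ones(dim), smooth))
```

In this sketch the smooth weights give a rescaled difference that stays bounded as the depth grows, while for independent weights it grows with the depth. This is only meant to show what "magnitude of the difference between successive weight matrices" measures, not to reproduce the paper's experiments or its exact bound.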
