LGMLMar 13, 2023

Bayes Complexity of Learners vs Overfitting

arXiv:2303.07874v12 citationsh-index: 47
Originality Incremental advance
AI Analysis

This addresses the problem of understanding and quantifying generalization in machine learning, particularly for neural networks, by providing a unified complexity notion, though it appears incremental in building on existing PAC-Bayes and complexity theory.

The paper introduces a new complexity measure for functions that governs a PAC-Bayes generalization bound, relates to natural complexity notions like variation for neural networks, and explains generalization gaps between neural networks and linear schemes. It shows a separation in sample complexity for good generalization between 2 and 4-layer neural networks on periodic functions.

We introduce a new notion of complexity of functions and we show that it has the following properties: (i) it governs a PAC Bayes-like generalization bound, (ii) for neural networks it relates to natural notions of complexity of functions (such as the variation), and (iii) it explains the generalization gap between neural networks and linear schemes. While there is a large set of papers which describes bounds that have each such property in isolation, and even some that have two, as far as we know, this is a first notion that satisfies all three of them. Moreover, in contrast to previous works, our notion naturally generalizes to neural networks with several layers. Even though the computation of our complexity is nontrivial in general, an upper-bound is often easy to derive, even for higher number of layers and functions with structure, such as period functions. An upper-bound we derive allows to show a separation in the number of samples needed for good generalization between 2 and 4-layer neural networks for periodic functions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes