LG, LO, LO, PR, ML · Oct 14, 2024

Measurability in the Fundamental Theorem of Statistical Learning

arXiv:2410.10243v3 · 6 citations · h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses foundational theoretical issues in statistical learning theory, with implications for rigorous proofs and applications in model theory and neural networks.

The paper scrutinizes the measurability assumptions in proofs of the Fundamental Theorem of Statistical Learning for agnostic PAC learning, and gives a rigorous statement and proof under minimal measurability requirements. It then gives sufficient conditions for the PAC learnability of hypothesis spaces defined over o-minimal expansions of the reals, a class that covers neural networks for binary classification with ReLU and sigmoid activations.

The Fundamental Theorem of Statistical Learning states that a hypothesis space is PAC learnable if and only if its VC dimension is finite. For the agnostic model of PAC learning, the literature so far presents proofs of this theorem that often tacitly impose several measurability assumptions on the involved sets and functions. We scrutinize these proofs from a measure-theoretic perspective in order to explicitly extract the assumptions needed for a rigorous argument. This leads to a sound statement as well as a detailed and self-contained proof of the Fundamental Theorem of Statistical Learning in the agnostic setting, showcasing the minimal measurability requirements needed. As the Fundamental Theorem of Statistical Learning underpins a wide range of further theoretical developments, our results are of foundational importance: A careful analysis of measurability aspects is essential, especially when the theorem is used in settings where measure-theoretic subtleties play a role. We particularly discuss applications in Model Theory, considering NIP and o-minimal structures. Our main theorem presents sufficient conditions for the PAC learnability of hypothesis spaces defined over o-minimal expansions of the reals. This class of hypothesis spaces covers all artificial neural networks for binary classification that use commonly employed activation functions like ReLU and the sigmoid function.
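To make the notion of VC dimension in the abstract concrete, the following is a minimal sketch that computes it by brute force for a finite hypothesis class restricted to a finite domain. The function names (`is_shattered`, `vc_dimension`) and the toy classes (thresholds and intervals on five integer points) are illustrative assumptions and do not come from the paper. On finite sets every subset is measurable, so this toy setting does not exhibit the measure-theoretic subtleties the paper is concerned with.

```python
from itertools import combinations


def is_shattered(points, hypotheses):
    """Return True if the hypotheses realize all 2^|points| labelings of points."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)


def vc_dimension(domain, hypotheses):
    """Brute-force VC dimension of a finite class restricted to a finite domain."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(is_shattered(s, hypotheses) for s in combinations(domain, k)):
            d = k
        else:
            # Subsets of shattered sets are shattered, so if no k-set is
            # shattered, no larger set can be shattered either.
            break
    return d


# Hypothetical toy domain: five points on the integer line.
domain = [0, 1, 2, 3, 4]

# Threshold classifiers h_t(x) = 1 if x >= t, else 0.
thresholds = [lambda x, t=t: int(x >= t) for t in range(6)]

# Interval classifiers h_{a,b}(x) = 1 if a <= x <= b, else 0.
intervals = [
    lambda x, a=a, b=b: int(a <= x <= b)
    for a in range(5)
    for b in range(a, 5)
]

print(vc_dimension(domain, thresholds))  # thresholds cannot label (1, 0) on x < y
print(vc_dimension(domain, intervals))   # intervals cannot label (1, 0, 1)
```

Thresholds shatter single points but no pair, and intervals shatter pairs but no triple, so the two classes have VC dimension 1 and 2 respectively on this domain.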
