LGFAMar 7

Margin in Abstract Spaces

arXiv:2603.07221v1
Predicted impact top 59% in LG · last 90 daysOriginality Highly original
AI Analysis

This work provides foundational insights into the generalization properties of margin-based learning for machine learning theorists and researchers, particularly concerning its independence from parameter count in over-parameterized settings.

This paper investigates the minimal mathematical structure required for margin-based learning, showing that for certain distance-based concepts, learnability in arbitrary metric spaces depends only on the triangle inequality when margins are sufficiently large (R>3r). They further demonstrate a sharp margin threshold for learnability of bounded linear combinations of distance functions in any metric space and establish that margin-based learnability cannot always be reduced to linear classification in a Banach space.

Margin-based learning, exemplified by linear and kernel methods, is one of the few classical settings where generalization guarantees are independent of the number of parameters. This makes it a central case study in modern highly over-parameterized learning. We ask what minimal mathematical structure underlies this phenomenon. We begin with a simple margin-based problem in arbitrary metric spaces: concepts are defined by a center point and classify points according to whether their distance lies below $r$ or above $R$. We show that whenever $R>3r$, this class is learnable in \emph{any} metric space. Thus, sufficiently large margins make learnability depend only on the triangle inequality, without any linear or analytic structure. Our first main result extends this phenomenon to concepts defined by bounded linear combinations of distance functions, and reveals a sharp threshold: there exists a universal constant $γ>0$ such that above this margin the class is learnable in every metric space, while below it there exist metric spaces where it is not learnable at all. We then ask whether margin-based learnability can always be explained via an embedding into a linear space -- that is, reduced to linear classification in some Banach space through a kernel-type construction. We answer this negatively by developing a structural taxonomy of Banach spaces: if a Banach space is learnable for some margin parameter $γ\geq 0$, then it is learnable for all such $γ$, and in infinite-dimensional spaces the sample complexity must scale polynomially in $1/γ$. Specifically, it must grow as $(1/γ)^p$ for some $p\ge 2$, and every such rate is attainable.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes