LG DS OC ML | Nov 20, 2024

Omnipredicting Single-Index Models with Multi-Index Models

arXiv:2411.13083v2 | 13.47 citations | h-index: 4 | STOC

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of efficient omniprediction in learning theory, offering a more practical approach to supervised learning over broad families of loss functions. It is incremental in that it builds on existing methods such as Isotron.

Prior omnipredictor constructions for single-index models suffer from high sample complexity and runtime. The paper gives a new construction that requires approximately ε⁻⁴ samples and runs in nearly-linear time; the sample complexity improves to approximately ε⁻² when link functions are bi-Lipschitz. The only prior known construction used at least roughly ε⁻¹⁰ samples.

Recent work on supervised learning [GKR+22] defined the notion of omnipredictors, i.e., predictor functions $p$ over features that are simultaneously competitive for minimizing a family of loss functions $\mathcal{L}$ against a comparator class $\mathcal{C}$. Omniprediction requires approximating the Bayes-optimal predictor beyond the loss minimization paradigm, and has generated significant interest in the learning theory community. However, even for basic settings such as agnostically learning single-index models (SIMs), existing omnipredictor constructions require impractically-large sample complexities and runtimes, and output complex, highly-improper hypotheses. Our main contribution is a new, simple construction of omnipredictors for SIMs. We give a learner outputting an omnipredictor that is $\varepsilon$-competitive on any matching loss induced by a monotone, Lipschitz link function, when the comparator class is bounded linear predictors. Our algorithm requires $\approx \varepsilon^{-4}$ samples and runs in nearly-linear time, and its sample complexity improves to $\approx \varepsilon^{-2}$ if link functions are bi-Lipschitz. This significantly improves upon the only prior known construction, due to [HJKRR18, GHK+23], which used $\gtrsim \varepsilon^{-10}$ samples. We achieve our construction via a new, sharp analysis of the classical Isotron algorithm [KS09, KKKS11] in the challenging agnostic learning setting, of potential independent interest. Previously, Isotron was known to properly learn SIMs in the realizable setting, as well as constant-factor competitive hypotheses under the squared loss [ZWDD24]. As they are based on Isotron, our omnipredictors are multi-index models with $\approx \varepsilon^{-2}$ prediction heads, bringing us closer to the tantalizing goal of proper omniprediction for general loss families and comparators.
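As a rough illustration of the Isotron template the abstract refers to, the sketch below alternates an isotonic (monotone) fit of the labels along the current direction with a residual-driven update of that direction, and keeps every iterate as a candidate prediction head. It follows the classical description of Isotron, not the paper's analysis: the unit step size, the absence of any projection or Lipschitz constraint on the link, and the names `isotonic_fit` and `isotron` are all assumptions made for illustration.

```python
import numpy as np


def isotonic_fit(z, y):
    """Least-squares nondecreasing fit of y as a function of z.

    Points with equal z share one fitted value. Uses pool-adjacent-violators.
    """
    _, inv, cnt = np.unique(z, return_inverse=True, return_counts=True)
    avg = np.bincount(inv, weights=y) / cnt  # mean label per distinct z
    means, weights, sizes = [], [], []
    for a, c in zip(avg, cnt):
        means.append(float(a))
        weights.append(int(c))
        sizes.append(1)
        # Merge adjacent blocks while they violate monotonicity.
        while len(means) > 1 and means[-2] > means[-1]:
            wt = weights[-2] + weights[-1]
            mu = (means[-2] * weights[-2] + means[-1] * weights[-1]) / wt
            sz = sizes[-2] + sizes[-1]
            means[-2:] = [mu]
            weights[-2:] = [wt]
            sizes[-2:] = [sz]
    fitted_unique = np.repeat(means, sizes)  # one value per distinct z
    return fitted_unique[inv]


def isotron(X, y, n_iters=50):
    """Classical Isotron-style loop (illustrative; unit step size assumed).

    Returns the list of weight iterates (candidate heads) and the training
    squared error of the monotone fit along each iterate.
    """
    m, d = X.shape
    w = np.zeros(d)
    heads, errors = [], []
    for _ in range(n_iters):
        u = isotonic_fit(X @ w, y)        # monotone link fitted along w
        heads.append(w.copy())
        errors.append(float(np.mean((u - y) ** 2)))
        w = w + X.T @ (y - u) / m         # move w along the average residual
    return heads, errors
```

Calling `isotron(X, y)` on an `(m, d)` feature matrix and a length-`m` label vector returns one weight vector per iteration. Keeping all iterates rather than only the last is meant to echo the abstract's description of a multi-index model with many prediction heads, though how the paper combines the heads is not shown here.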

