LG MLMar 4

Oracle-efficient Hybrid Learning with Constrained Adversaries

Princewill Okoroafor, Robert Kleinberg, Michael P. Kim

arXiv:2603.04546v1

Originality Incremental advance

AI Analysis

This work provides an incremental step towards simultaneously achieving statistical optimality and computational efficiency in hybrid online learning for researchers and practitioners in online learning and game theory.

This paper addresses the Hybrid Online Learning Problem, where features are drawn i.i.d. and labels are adversarial, aiming for both statistical optimality and computational efficiency. The authors propose a new oracle-efficient learning algorithm for a structured setting where the adversary's label choices are constrained, achieving regret scaling with the Rademacher complexity of a class derived from the learner's and adversary's function classes.

The Hybrid Online Learning Problem, where features are drawn i.i.d. from an unknown distribution but labels are generated adversarially, is a well-motivated setting positioned between statistical and fully-adversarial online learning. Prior work has presented a dichotomy: algorithms that are statistically-optimal, but computationally intractable (Wu et al., 2023), and algorithms that are computationally-efficient (given an ERM oracle), but statistically-suboptimal (Wu et al., 2024). This paper takes a significant step towards achieving statistical optimality and computational efficiency simultaneously in the Hybrid Learning setting. To do so, we consider a structured setting, where the Adversary is constrained to pick labels from an expressive, but fixed, class of functions $R$. Our main result is a new learning algorithm, which runs efficiently given an ERM oracle and obtains regret scaling with the Rademacher complexity of a class derived from the Learner's hypothesis class $H$ and the Adversary's label class $R$. As a key corollary, we give an oracle-efficient algorithm for computing equilibria in stochastic zero-sum games when action sets may be high-dimensional but the payoff function exhibits a type of low-dimensional structure. Technically, we develop a number of tools for the design and analysis of our learning algorithm, including a novel Frank-Wolfe reduction with "truncated entropy regularizer" and a new tail bound for sums of "hybrid" martingale difference sequences.

View on arXiv PDF

Similar