LG OC · Feb 11

Online Min-Max Optimization: From Individual Regrets to Cumulative Saddle Points

arXiv:2602.10565v1 · 1.4 · h-index: 12

Originality: Incremental advance

AI Analysis

This work addresses online optimization challenges in game theory and machine learning, offering incremental improvements by adapting existing frameworks to min-max settings.

The paper tackles online min-max optimization by proposing new performance measures and algorithms that achieve bounds on them. It bounds a static duality gap and a dynamic saddle point regret under strong convexity-strong concavity and under min-max exponential concavity, and its results cover a two-player variant of portfolio selection.

We propose and study an online version of min-max optimization based on cumulative saddle points under a variety of performance measures beyond convex-concave settings. After first observing the incompatibility of (static) Nash equilibrium (SNE-Reg$_T$) with individual regrets even for strongly convex-strongly concave functions, we propose an alternate static duality gap (SDual-Gap$_T$) inspired by the online convex optimization (OCO) framework. We provide algorithms that, using a reduction to classic OCO problems, achieve bounds for SDual-Gap$_T$ and a novel dynamic saddle point regret (DSP-Reg$_T$), which we suggest naturally represents a min-max version of the dynamic regret in OCO. We derive our bounds for SDual-Gap$_T$ and DSP-Reg$_T$ under strong convexity-strong concavity and a min-max notion of exponential concavity (min-max EC), and in addition we establish a class of functions satisfying min-max EC that captures a two-player variant of the classic portfolio selection problem. Finally, for a dynamic notion of regret compatible with individual regrets, we derive bounds under a two-sided Polyak-Łojasiewicz (PL) condition.
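To make the setting concrete, here is a minimal Python sketch of two players running projected online gradient descent-ascent on a sequence of strongly convex-strongly concave quadratics, followed by a static duality gap computed in hindsight. Everything in it is an assumption made for illustration and is not taken from the paper: the loss family `f`, the helper names, the step sizes `1/(a*t)` and `1/(b*t)` (the usual choice for strongly convex OCO), and in particular the gap formula, which is assumed here to be $\max_y \sum_t f_t(x_t, y) - \min_x \sum_t f_t(x, y_t)$. The paper's exact definition of SDual-Gap$_T$ and its algorithms may differ.

```python
import random


def clip(v, lo=-1.0, hi=1.0):
    """Project a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def f(x, y, p, q, a=1.0, b=1.0, c=0.5):
    """Hypothetical round loss: a-strongly convex in x, b-strongly concave in y."""
    return 0.5 * a * (x - p) ** 2 + c * x * y - 0.5 * b * (y - q) ** 2


def run_ogda(targets, a=1.0, b=1.0, c=0.5):
    """Projected online gradient descent (x) and ascent (y) on [-1, 1]^2.

    Each player treats its own variable as a separate OCO problem and
    updates only after the round's parameters (p, q) are revealed.
    """
    x, y = 0.0, 0.0
    plays = []
    for t, (p, q) in enumerate(targets, start=1):
        plays.append((x, y))
        grad_x = a * (x - p) + c * y
        grad_y = c * x - b * (y - q)
        x = clip(x - grad_x / (a * t))  # descent step for the min player
        y = clip(y + grad_y / (b * t))  # ascent step for the max player
    return plays


def best_response_values(plays, targets, a=1.0, b=1.0, c=0.5):
    """Return (max_y sum_t f_t(x_t, y), min_x sum_t f_t(x, y_t)) over [-1, 1].

    Both sums are one-dimensional quadratics, so the constrained optimum
    is the unconstrained vertex clipped to the interval.
    """
    T = len(plays)
    sum_x = sum(x for x, _ in plays)
    sum_y = sum(y for _, y in plays)
    sum_p = sum(p for p, _ in targets)
    sum_q = sum(q for _, q in targets)
    y_star = clip((c * sum_x + b * sum_q) / (b * T))
    x_star = clip((a * sum_p - c * sum_y) / (a * T))
    upper = sum(f(x_t, y_star, p, q, a, b, c)
                for (x_t, _), (p, q) in zip(plays, targets))
    lower = sum(f(x_star, y_t, p, q, a, b, c)
                for (_, y_t), (p, q) in zip(plays, targets))
    return upper, lower


def static_duality_gap(plays, targets, a=1.0, b=1.0, c=0.5):
    """Assumed form of the static duality gap (see the note above)."""
    upper, lower = best_response_values(plays, targets, a, b, c)
    return upper - lower


if __name__ == "__main__":
    rng = random.Random(0)
    targets = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
    plays = run_ogda(targets)
    print("gap per round:", static_duality_gap(plays, targets) / len(plays))
```

Under this assumed form, the gap equals the sum of the two players' individual static regrets, which is why a reduction to per-player OCO algorithms is a natural way to bound it.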
