LGCCMay 13

Polyhedral Instability Governs Regret in Online Learning

arXiv:2605.1369251.4
AI Analysis

For researchers in online learning and combinatorial optimization, it provides a unified geometric explanation of regret scaling, though the result is incremental as it builds on known convex relaxation frameworks.

The paper shows that regret in online learning over combinatorial actions is governed by polyhedral instability (number of region switches), proving a tight regret bound that interpolates between experts-like and dimension-dependent rates, and validates the theory on shortest path and influence maximization problems.

Many online decision problems over combinatorial actions are addressed via convex relaxations, leading to online convex optimization with piecewise linear objectives and induced polyhedral structure. We show that regret in such problems is governed by \emph{polyhedral instability}: the number of changes of the active region. Under full information feedback and fixed partition assumptions, if $\mathrm{RS}_T$ denotes the number of region switches and $V_{\max}$ the maximum number of vertices per region, we prove $\Regret_T= Θ(\sqrt{(1+\mathrm{RS}_T)\,T\,\log V_{\max}})$ interpolating between experts-like and dimension-dependent OCO rates. For online submodular--concave games under Lovász convexification, this reduces to the permutation-switch count $\mathrm{SC}_T$, yielding the matching rate $\Regret_T= Θ(\sqrt{(1+\mathrm{SC}_T)\,T\,\log n})$. Experiments on synthetic and real combinatorial problems (shortest path, influence maximization) validate the predicted scaling and indicate that low-instability regimes can arise in practice without explicit enumeration of actions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes