LGOCMay 15, 2024

Tight Bounds for Online Convex Optimization with Adversarial Constraints

arXiv:2405.09296v13 citationsh-index: 3
Originality Highly original
AI Analysis

This solves a foundational theoretical problem in online optimization with broad implications for machine learning and control systems.

The paper tackles the problem of constrained online convex optimization (COCO) by designing an online policy that simultaneously achieves O(√T) regret and Õ(√T) cumulative constraint violation against an adaptive adversary, answering a long-standing open question.

A well-studied generalization of the standard online convex optimization (OCO) is constrained online convex optimization (COCO). In COCO, on every round, a convex cost function and a convex constraint function are revealed to the learner after the action for that round is chosen. The objective is to design an online policy that simultaneously achieves a small regret while ensuring small cumulative constraint violation (CCV) against an adaptive adversary. A long-standing open question in COCO is whether an online policy can simultaneously achieve $O(\sqrt{T})$ regret and $O(\sqrt{T})$ CCV without any restrictive assumptions. For the first time, we answer this in the affirmative and show that an online policy can simultaneously achieve $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV. We establish this result by effectively combining the adaptive regret bound of the AdaGrad algorithm with Lyapunov optimization - a classic tool from control theory. Surprisingly, the analysis is short and elegant.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes