$O(\sqrt{T})$ Static Regret and Instance Dependent Constraint Violation for Constrained Online Convex Optimization
This work addresses the problem of balancing regret and constraint violations in online learning for applications like resource allocation, offering an instance-dependent improvement over prior universal results.
The paper tackles constrained online convex optimization by proposing an algorithm that achieves O(√T) static regret and an instance-dependent cumulative constraint violation of min{V, O(√T log T)}, where V depends on problem parameters, improving over universal O(√T log T) bounds.
The constrained version of the standard online convex optimization (OCO) framework, called COCO is considered, where on every round, a convex cost function and a convex constraint function are revealed to the learner after it chooses the action for that round. The objective is to simultaneously minimize the static regret and cumulative constraint violation (CCV). An algorithm is proposed that guarantees a static regret of $O(\sqrt{T})$ and a CCV of $\min\{\cV, O(\sqrt{T}\log T) \}$, where $\cV$ depends on the distance between the consecutively revealed constraint sets, the shape of constraint sets, dimension of action space and the diameter of the action space. For special cases of constraint sets, $\cV=O(1)$. Compared to the state of the art results, static regret of $O(\sqrt{T})$ and CCV of $O(\sqrt{T}\log T)$, that were universal, the new result on CCV is instance dependent, which is derived by exploiting the geometric properties of the constraint sets.