LG OC ML · Jan 24, 2023

On Dynamic Regret and Constraint Violations in Constrained Online Convex Optimization

arXiv:2301.09808v1 · 9.86 citations · h-index: 21

Originality: Incremental advance

AI Analysis

This work addresses constrained optimization in online settings. It is an incremental advance: it builds on existing online convex optimization (OCO) frameworks and contributes new bounds.

The paper tackles the problem of constrained online convex optimization by proposing an algorithm that simultaneously minimizes dynamic regret and constraint violations, achieving bounds that scale with the path-length of optimal actions and proving these bounds are optimal.

A constrained version of the online convex optimization (OCO) problem is considered. Time is slotted, and in each slot an action is chosen first. Subsequently, the loss function and the constraint violation penalty evaluated at the chosen action are revealed. For each slot, both the loss function and the function defining the constraint set are assumed to be smooth and strongly convex. In addition, once an action is chosen, local information about the feasible set within a small neighborhood of the current action is also revealed. Given this feedback, an algorithm is allowed to compute at most one gradient, at a point of its choice, to choose the next action. The goal of an algorithm is to simultaneously minimize the dynamic regret (loss incurred compared to the oracle's loss) and the constraint violation penalty (penalty accrued compared to the oracle's penalty). We propose an algorithm that follows projected gradient descent over a suitably chosen set around the current action. We show that both the dynamic regret and the constraint violation are order-wise bounded by the path-length, the sum of the distances between consecutive optimal actions. Moreover, we show that the derived bounds are the best possible.
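The quantities named in the abstract can be illustrated with a small one-dimensional simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: the quadratic loss and constraint functions, the rule for choosing which single gradient to compute, the parameters `eta`, `delta`, and `radius`, and all function names are hypothetical choices made for illustration. The sketch takes one gradient step per slot, projects it onto a small interval around the current action, and accumulates dynamic regret, constraint violation, and the path-length of the oracle's actions.

```python
def clip(value, low, high):
    """Project a scalar onto the interval [low, high]."""
    return max(low, min(high, value))


def path_length(optima):
    """Sum of distances between consecutive optimal actions."""
    return sum(abs(b - a) for a, b in zip(optima, optima[1:]))


def simulate(centers, bounds, radius=1.0, eta=0.5, delta=0.5, x0=0.0):
    """Run a toy constrained OCO loop in one dimension.

    Assumed (hypothetical) model for slot t:
      loss        f_t(x) = 0.5 * (x - c_t)^2
      constraint  g_t(x) = (x - b_t)^2 - radius^2 <= 0
    Both are smooth and strongly convex, as the abstract requires.
    """
    x = x0
    regret = 0.0
    violation = 0.0
    optima = []
    for c, b in zip(centers, bounds):
        # Feedback is revealed only after the action x has been chosen.
        loss = 0.5 * (x - c) ** 2
        g = (x - b) ** 2 - radius ** 2

        # Oracle action: minimizer of f_t over the feasible interval.
        x_star = clip(c, b - radius, b + radius)
        optima.append(x_star)

        # Dynamic regret compares against the per-slot oracle loss.
        regret += loss - 0.5 * (x_star - c) ** 2
        # The oracle is feasible, so its penalty is zero.
        violation += max(g, 0.0)

        # Exactly one gradient per slot (illustrative rule, an assumption):
        # follow the loss when feasible, otherwise follow the constraint.
        grad = (x - c) if g <= 0 else 2.0 * (x - b)

        # Projected gradient step over a neighborhood of the current action.
        x = clip(x - eta * grad, x - delta, x + delta)
    return regret, violation, path_length(optima)


if __name__ == "__main__":
    drift = [0.1 * t for t in range(50)]
    print(simulate(drift, drift))
```

In this toy setting the oracle's actions drift slowly, so the path-length is small, and the accumulated regret and violation can be compared against it by varying the drift rate.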
