LG OC ML · Jan 31, 2023

Unconstrained Dynamic Regret via Sparse Coding

Zhiyu Zhang, Ashok Cutkosky, Ioannis Ch. Paschalidis

Harvard

arXiv:2301.13349v514.313 citations · h-index: 41

Originality: Incremental advance

AI Analysis

This work derives comparator-adaptive regret bounds for online learning. Its approach is versatile and improves on prior methods in specific ways, though the advance is incremental.

The paper tackles nonstationarity in online convex optimization with an unbounded domain and an arbitrary comparator sequence. It introduces a sparse coding framework whose regret bounds adapt to the complexity of the comparator. With a wavelet dictionary, the framework improves on the state of the art by adapting to the magnitude of the comparator average and to the centered comparator variability, rather than to the maximum comparator norm and the uncentered sum of norms.

Motivated by the challenge of nonstationarity in sequential decision making, we study Online Convex Optimization (OCO) under the coupling of two problem structures: the domain is unbounded, and the comparator sequence $u_1,\ldots,u_T$ is arbitrarily time-varying. As no algorithm can guarantee low regret simultaneously against all comparator sequences, handling this setting requires moving from minimax optimality to comparator adaptivity. That is, sensible regret bounds should depend on certain complexity measures of the comparator relative to one's prior knowledge. This paper achieves a new type of these adaptive regret bounds via a sparse coding framework. The complexity of the comparator is measured by its energy and its sparsity on a user-specified dictionary, which offers considerable versatility. Equipped with a wavelet dictionary for example, our framework improves the state-of-the-art bound (Jacobsen & Cutkosky, 2022) by adapting to both ($i$) the magnitude of the comparator average $||\bar u||=||\sum_{t=1}^Tu_t/T||$, rather than the maximum $\max_t||u_t||$; and ($ii$) the comparator variability $\sum_{t=1}^T||u_t-\bar u||$, rather than the uncentered sum $\sum_{t=1}^T||u_t||$. Furthermore, our analysis is simpler due to decoupling function approximation from regret minimization.
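The quantities compared in the abstract can be computed by hand on a toy comparator sequence. The sketch below is illustrative only and is not the paper's algorithm: the function names `comparator_statistics` and `haar_transform`, the toy sequence, and the choice of the Haar wavelet as the dictionary are all assumptions made here. It computes $||\bar u||$, $\max_t||u_t||$, $\sum_t||u_t-\bar u||$, and $\sum_t||u_t||$, then counts the nonzero coefficients of the sequence in an orthonormal Haar basis as a rough stand-in for "sparsity on a dictionary".

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def comparator_statistics(us):
    """Return the four comparator quantities contrasted in the abstract.

    `us` is a list of T vectors u_1, ..., u_T, each a list of floats.
    """
    T, d = len(us), len(us[0])
    mean = [sum(u[i] for u in us) / T for i in range(d)]
    centered = [[u[i] - mean[i] for i in range(d)] for u in us]
    return {
        "mean_norm": norm(mean),                                # ||u_bar||
        "max_norm": max(norm(u) for u in us),                   # max_t ||u_t||
        "centered_variability": sum(norm(c) for c in centered), # sum_t ||u_t - u_bar||
        "uncentered_sum": sum(norm(u) for u in us),             # sum_t ||u_t||
    }


def haar_transform(signal):
    """Orthonormal Haar transform of a scalar sequence of length 2^k.

    Assumption: Haar is used here as one simple example of a wavelet
    dictionary; the paper's construction may differ.
    """
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Prepend so that coarser-scale coefficients come first.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details


# Toy 1-D comparator (hypothetical): sits at 9, then switches once to 11.
scalars = [9.0] * 4 + [11.0] * 4
stats = comparator_statistics([[x] for x in scalars])
coeffs = haar_transform(scalars)
num_nonzero = sum(1 for c in coeffs if abs(c) > 1e-9)
```

On this toy sequence the average has norm 10 against a maximum norm of 11, the centered variability is 8 against an uncentered sum of 80, and only 2 of the 8 Haar coefficients are nonzero. The large gap between the centered and uncentered sums is specific to sequences that hover around a nonzero offset. In general the centered sum is at most twice the uncentered one, so the improvement depends on the comparator.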
