Parameter-free Regret in High Probability with Heavy Tails
This addresses a gap in online learning for unbounded domains, providing high-probability guarantees where previous work only offered in-expectation results, which is incremental but important for robust optimization.
The paper tackles the problem of online convex optimization over unbounded domains with heavy-tailed subgradients, achieving parameter-free regret in high probability, specifically regret $ ilde{O}(\| \mathbf{u} \| T^{1/\mathfrak{p}} \log (1/δ))$ with probability at most $δ$.
We present new algorithms for online convex optimization over unbounded domains that obtain parameter-free regret in high-probability given access only to potentially heavy-tailed subgradient estimates. Previous work in unbounded domains considers only in-expectation results for sub-exponential subgradients. Unlike in the bounded domain case, we cannot rely on straight-forward martingale concentration due to exponentially large iterates produced by the algorithm. We develop new regularization techniques to overcome these problems. Overall, with probability at most $δ$, for all comparators $\mathbf{u}$ our algorithm achieves regret $\tilde{O}(\| \mathbf{u} \| T^{1/\mathfrak{p}} \log (1/δ))$ for subgradients with bounded $\mathfrak{p}^{th}$ moments for some $\mathfrak{p} \in (1, 2]$.