Sparsity-Based Interpolation of External, Internal and Swap Regret
This work designs an instance-adaptive algorithm for regret minimization in online learning that improves on existing bounds in the regimes between external, internal and swap regret. It is relevant to researchers in optimization and machine learning.
This paper studies how to interpolate between external, internal and swap regret in online learning via $φ$-regret minimization. It achieves an instance-adaptive bound of $\tilde O\left(\min\left\{\sqrt{d-d^{\mathrm{unif}}_φ+1},\sqrt{d-d^{\mathrm{self}}_φ}\right\}\cdot\sqrt{T}\right)$ that recovers the known bounds in the extreme cases and improves upon existing algorithms in the intermediate regimes.
Focusing on the expert problem in online learning, this paper studies the interpolation of several performance metrics via $φ$-regret minimization, which measures the performance of an algorithm by its regret with respect to an arbitrary action modification rule $φ$. With $d$ experts and $T\gg d$ rounds in total, we present a single algorithm achieving the instance-adaptive $φ$-regret bound \begin{equation*} \tilde O\left(\min\left\{\sqrt{d-d^{\mathrm{unif}}_φ+1},\sqrt{d-d^{\mathrm{self}}_φ}\right\}\cdot\sqrt{T}\right), \end{equation*} where $d^{\mathrm{unif}}_φ$ is the maximum number of experts modified identically by $φ$, and $d^{\mathrm{self}}_φ$ is the number of experts that $φ$ trivially modifies to themselves. By recovering the optimal $O(\sqrt{T\log d})$ external regret bound when $d^{\mathrm{unif}}_φ=d$, the standard $\tilde O(\sqrt{T})$ internal regret bound when $d^{\mathrm{self}}_φ=d-1$ and the optimal $\tilde O(\sqrt{dT})$ swap regret bound in the worst case, we improve upon existing algorithms in the intermediate regimes. In addition, the computational complexity of our algorithm matches that of the standard swap-regret minimization algorithm due to Blum and Mansour (2007). Technically, building on the well-known reduction from $φ$-regret minimization to external regret minimization on stochastic matrices, our main idea is to further convert the latter to online linear regression using Haar-wavelet-inspired matrix features. Then, by associating the complexity of each $φ$ instance with its sparsity under the feature representation, we apply techniques from comparator-adaptive online learning to exploit the sparsity in this regression subroutine.
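To make the two complexity measures concrete, the following minimal sketch computes $d^{\mathrm{unif}}_φ$, $d^{\mathrm{self}}_φ$ and the instance-dependent factor $\min\{\sqrt{d-d^{\mathrm{unif}}_φ+1},\sqrt{d-d^{\mathrm{self}}_φ}\}$ for a given modification rule. It is an illustration of the definitions only, not the algorithm of the paper. The encoding of $φ$ as a list `phi` with `phi[i]` the expert that replaces expert `i`, and the function names `d_unif`, `d_self` and `complexity_factor`, are assumptions made here; constants and logarithmic factors hidden by $\tilde O$ are ignored.

```python
import math
from collections import Counter


def d_unif(phi):
    """Largest number of experts that phi sends to one common target.

    Assumption: phi is a list of length d, phi[i] in {0, ..., d-1}.
    """
    return max(Counter(phi).values())


def d_self(phi):
    """Number of experts that phi maps to themselves (fixed points)."""
    return sum(1 for i, target in enumerate(phi) if target == i)


def complexity_factor(phi):
    """min{sqrt(d - d_unif + 1), sqrt(d - d_self)}, without log factors."""
    d = len(phi)
    return min(math.sqrt(d - d_unif(phi) + 1), math.sqrt(d - d_self(phi)))


d = 8

# External regret: every expert is replaced by the same fixed expert.
external = [3] * d

# Internal regret: a single expert (0) is replaced by another (1).
internal = list(range(d))
internal[0] = 1

# A hard swap-regret instance: a cyclic shift with no fixed points
# and no two experts sharing a target.
cyclic = [(i + 1) % d for i in range(d)]

for name, phi in [("external", external), ("internal", internal), ("cyclic", cyclic)]:
    print(name, d_unif(phi), d_self(phi), complexity_factor(phi))
```

In this sketch the external and internal instances both give a factor of 1, matching the $\sqrt{T}$-type bounds up to logarithmic terms, while the cyclic shift gives $\sqrt{d}$, matching the worst-case swap regret rate.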