Cautious Optimism: A Meta-Algorithm for Near-Constant Regret in General Games
This provides a novel framework for faster and more efficient learning in multi-agent systems, addressing a bottleneck in game theory and optimization with broad applications in AI and economics.
The paper tackles the problem of accelerating no-regret learning in general games by introducing Cautious Optimism, a meta-algorithm that adaptively controls learning pace to achieve near-optimal O_T(log T) regret in self-play scenarios while maintaining O_T(sqrt T) regret in adversarial settings, with exponential improvements in dimension dependence over prior works.
We introduce Cautious Optimism, a framework for substantially faster regularized learning in general games. Cautious Optimism, as a variant of Optimism, adaptively controls the learning pace in a dynamic, non-monotone manner to accelerate no-regret learning dynamics. Cautious Optimism takes as input any instance of Follow-the-Regularized-Leader (FTRL) and outputs an accelerated no-regret learning algorithm (COFTRL) by pacing the underlying FTRL with minimal computational overhead. Importantly, it retains uncoupledness, that is, learners do not need to know other players' utilities. Cautious Optimistic FTRL (COFTRL) achieves near-optimal $O_T(\log T)$ regret in diverse self-play (mixing and matching regularizers) while preserving the optimal $O_T(\sqrt{T})$ regret in adversarial scenarios. In contrast to prior works (e.g., Syrgkanis et al. [2015], Daskalakis et al. [2021]), our analysis does not rely on monotonic step sizes, showcasing a novel route for fast learning in general games. Moreover, instances of COFTRL achieve new state-of-the-art regret minimization guarantees in general convex games, exponentially improving the dependence on the dimension of the action space $d$ over previous works [Farina et al., 2022a].