No-Regret Learning in Games with Noisy Feedback: Faster Rates and Adaptivity via Learning Rate Separation
This work addresses the challenge of efficient learning in multi-agent environments with noisy feedback, offering faster rates and adaptivity for game theory and optimization applications, though it is incremental in building on existing no-regret algorithms.
The paper tackles the problem of minimizing regret in continuous games with noisy feedback, showing that players can achieve constant regret with multiplicative noise using an optimistic gradient scheme with learning rate separation, and proposes an adaptive method that attains similar guarantees without hyperparameter tuning.
We examine the problem of regret minimization when the learner is involved in a continuous game with other optimizing agents: in this case, if all players follow a no-regret algorithm, it is possible to achieve significantly lower regret relative to fully adversarial environments. We study this problem in the context of variationally stable games (a class of continuous games which includes all convex-concave and monotone games), and when the players only have access to noisy estimates of their individual payoff gradients. If the noise is additive, the game-theoretic and purely adversarial settings enjoy similar regret guarantees; however, if the noise is multiplicative, we show that the learners can, in fact, achieve constant regret. We achieve this faster rate via an optimistic gradient scheme with learning rate separation -- that is, the method's extrapolation and update steps are tuned to different schedules, depending on the noise profile. Subsequently, to eliminate the need for delicate hyperparameter tuning, we propose a fully adaptive method that attains nearly the same guarantees as its non-adapted counterpart, while operating without knowledge of either the game or of the noise profile.