Prediction with Corrupted Expert Advice
This addresses a fundamental online learning problem for researchers and practitioners, but it is incremental as it builds on classical algorithms by adapting them to corrupted feedback.
The paper tackles the problem of prediction with expert advice in a setting with adversarial corruption of feedback, proving that a variant of Multiplicative Weights with decreasing step sizes achieves constant regret and performs optimally regardless of corruption magnitude, while revealing a surprising disparity where OMD is strictly inferior to FTRL in this regime.
We revisit the fundamental problem of prediction with expert advice, in a setting where the environment is benign and generates losses stochastically, but the feedback observed by the learner is subject to a moderate adversarial corruption. We prove that a variant of the classical Multiplicative Weights algorithm with decreasing step sizes achieves constant regret in this setting and performs optimally in a wide range of environments, regardless of the magnitude of the injected corruption. Our results reveal a surprising disparity between the often comparable Follow the Regularized Leader (FTRL) and Online Mirror Descent (OMD) frameworks: we show that for experts in the corrupted stochastic regime, the regret performance of OMD is in fact strictly inferior to that of FTRL.