LG ML · Oct 10, 2019

Online Learning Using Only Peer Prediction

arXiv:1910.04382v2

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of learning from expert predictions when direct feedback is unavailable, though the advance appears incremental because it builds on classical online learning frameworks.

The paper tackles the problem of online learning without direct loss feedback by using peer prediction, and shows that under a peer calibration condition, standard algorithms achieve bounded regret with respect to the unrevealed ground truth.

This paper considers a variant of the classical online learning problem with expert predictions. The model differs from the classical setting, and is harder, because there is no direct feedback on the loss each expert incurs at each time step $t$. We propose an approach that uses peer prediction and identify conditions under which it succeeds. Our techniques revolve around a carefully designed peer score function $s(\cdot)$ that scores experts' predictions based on the peer consensus. We show a sufficient condition, which we call peer calibration, under which standard online learning algorithms that use loss feedback computed by $s(\cdot)$ have bounded regret with respect to the unrevealed ground truth values. We then demonstrate how suitable $s(\cdot)$ functions can be derived for different assumptions and models.
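
The idea of feeding a peer score into a standard online learning algorithm can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `peer_score` is a hypothetical stand-in for $s(\cdot)$ (squared distance to the mean of the other experts' predictions), not the score function derived in the paper, and the update is the textbook Hedge (multiplicative weights) rule with an arbitrary learning rate `eta`. No ground truth appears anywhere in the loop.

```python
import math

def peer_score(predictions, i):
    """Hypothetical stand-in for s(): squared distance between expert i's
    prediction and the mean of the other experts' predictions.
    Lower is treated as better. This is NOT the paper's score function."""
    others = [p for j, p in enumerate(predictions) if j != i]
    consensus = sum(others) / len(others)
    return (predictions[i] - consensus) ** 2

def hedge_with_peer_scores(rounds, eta=1.0):
    """Run Hedge using peer scores as the loss feedback.

    rounds: list of per-round prediction lists, one entry per expert.
    Returns the final normalized weight of each expert.
    """
    n = len(rounds[0])
    weights = [1.0 / n] * n
    for predictions in rounds:
        # The only feedback is computed from the experts' own predictions.
        losses = [peer_score(predictions, i) for i in range(n)]
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights

# Three experts over three rounds; the third consistently departs from its peers.
rounds = [[0.7, 0.6, 0.1], [0.8, 0.7, 0.2], [0.6, 0.7, 0.0]]
final_weights = hedge_with_peer_scores(rounds)
```

With this toy score, the expert who disagrees with the consensus loses weight whether or not it is actually correct. That is why a condition on the score function, such as the peer calibration condition the abstract describes, is needed before regret against the unrevealed ground truth can be bounded.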
