LG | Feb 7, 2024

Randomized Confidence Bounds for Stochastic Partial Monitoring

arXiv:2402.05002v2 | 2 citations | h-index: 2 | ICML
AI Analysis

This work addresses the challenge of choosing actions under limited feedback in sequential decision-making, with real-world applications such as monitoring the error rate of a deployed classification system. The contribution appears incremental, since it builds on existing partial monitoring (PM) frameworks.

The paper tackles the problem of sequential learning with incomplete feedback in stochastic partial monitoring by introducing randomized confidence bound strategies, achieving favorable performance against state-of-the-art baselines in multiple games.

The partial monitoring (PM) framework provides a theoretical formulation of sequential learning problems with incomplete feedback. On each round, a learning agent plays an action while the environment simultaneously chooses an outcome. The agent then observes a feedback signal that is only partially informative about the (unobserved) outcome. The agent leverages the received feedback signals to select actions that minimize the (unobserved) cumulative loss. In contextual PM, the outcomes depend on some side information that is observable by the agent before selecting the action on each round. In this paper, we consider the contextual and non-contextual PM settings with stochastic outcomes. We introduce a new class of PM strategies based on the randomization of deterministic confidence bounds. We also extend regret guarantees to settings where existing stochastic strategies are not applicable. Our experiments show that the proposed RandCBP and RandCBPsidestar strategies have favorable performance against state-of-the-art baselines in multiple PM games. To advocate for the adoption of the PM framework, we design a use case on the real-world problem of monitoring the error rate of any deployed classification system.
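The round-by-round protocol described above can be made concrete with a small simulation. The sketch below is not the paper's RandCBP strategy. It assumes the Apple Tasting game (two actions, two outcomes) as an illustrative PM game and uses a naive explore-then-commit policy as a stand-in. The names `LOSS`, `FEEDBACK`, `play`, and `explore_then_commit` are hypothetical and chosen only for this example. The regret reported is the realized regret against the best fixed action in hindsight.

```python
import random

# Apple Tasting, used here as an illustrative two-action, two-outcome PM game.
# Outcomes: 0 = good item, 1 = bad item. Actions: 0 = ship, 1 = inspect.
LOSS = [[0, 1],   # ship: loss 1 only if the item is bad
        [1, 0]]   # inspect: loss 1 only if the item was good
FEEDBACK = [["none", "none"],   # shipping reveals nothing about the outcome
            ["good", "bad"]]    # inspecting reveals the outcome

def play(policy, p_bad, horizon, seed=0):
    """Run the stochastic PM protocol; the policy only ever sees feedback."""
    rng = random.Random(seed)
    history = []              # (action, feedback) pairs visible to the agent
    total_loss = 0            # never shown to the agent
    outcome_counts = [0, 0]
    for _ in range(horizon):
        action = policy(history, rng)
        outcome = 1 if rng.random() < p_bad else 0   # i.i.d. outcome
        history.append((action, FEEDBACK[action][outcome]))
        total_loss += LOSS[action][outcome]
        outcome_counts[outcome] += 1
    # Loss of the best single action in hindsight on the realized outcomes.
    best_fixed = min(
        sum(LOSS[a][o] * outcome_counts[o] for o in (0, 1)) for a in (0, 1)
    )
    return total_loss, total_loss - best_fixed, history

def explore_then_commit(history, rng, n_explore=50):
    """Naive baseline (an assumption, not from the paper): inspect for
    n_explore rounds, then commit to the action that looks cheaper."""
    if len(history) < n_explore:
        return 1
    seen = [fb for action, fb in history if action == 1]
    bad_rate = seen.count("bad") / len(seen)
    return 1 if bad_rate > 0.5 else 0

if __name__ == "__main__":
    loss, regret, _ = play(explore_then_commit, p_bad=0.2, horizon=1000)
    print("cumulative loss:", loss, "regret:", regret)
```

The agent in this sketch never reads `LOSS[action][outcome]` directly; it sees only the symbols in `FEEDBACK`, which is the defining constraint of partial monitoring.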

Code Implementations: 1 repo
