MLLGJul 11, 2023

Tracking Most Significant Shifts in Nonparametric Contextual Bandits

arXiv:2307.05341v29 citationsh-index: 19
Originality Incremental advance
AI Analysis

This work addresses the challenge of adapting to changes in contextual bandits for applications like online recommendation systems, but it is incremental as it builds on prior work on non-stationary multi-armed bandits.

The paper tackles the problem of nonparametric contextual bandits with time-varying reward functions, establishing a minimax dynamic regret rate and proposing a new notion of change called 'experienced significant shifts' that accounts for locality, which they show can be adapted to without prior knowledge of change parameters.

We study nonparametric contextual bandits where Lipschitz mean reward functions may change over time. We first establish the minimax dynamic regret rate in this less understood setting in terms of number of changes $L$ and total-variation $V$, both capturing all changes in distribution over context space, and argue that state-of-the-art procedures are suboptimal in this setting. Next, we tend to the question of an adaptivity for this setting, i.e. achieving the minimax rate without knowledge of $L$ or $V$. Quite importantly, we posit that the bandit problem, viewed locally at a given context $X_t$, should not be affected by reward changes in other parts of context space $\cal X$. We therefore propose a notion of change, which we term experienced significant shifts, that better accounts for locality, and thus counts considerably less changes than $L$ and $V$. Furthermore, similar to recent work on non-stationary MAB (Suk & Kpotufe, 2022), experienced significant shifts only count the most significant changes in mean rewards, e.g., severe best-arm changes relevant to observed contexts. Our main result is to show that this more tolerant notion of change can in fact be adapted to.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes