MLLGMay 25, 2021

Bias-Robust Bayesian Optimization via Dueling Bandits

arXiv:2105.11802v212 citations
Originality Incremental advance
AI Analysis

This addresses robustness issues in optimization for applications like clinical trials or A/B testing where biases can skew results, though it builds incrementally on existing bandit methods.

The paper tackles Bayesian optimization with adversarially biased observations, such as from hidden confounders, by reducing it to a dueling bandit model and proposing a novel information-directed sampling approach, resulting in the first efficient kernelized algorithm with cumulative regret guarantees.

We consider Bayesian optimization in settings where observations can be adversarially biased, for example by an uncontrolled hidden confounder. Our first contribution is a reduction of the confounded setting to the dueling bandit model. Then we propose a novel approach for dueling bandits based on information-directed sampling (IDS). Thereby, we obtain the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees. Our analysis further generalizes a previously proposed semi-parametric linear bandit model to non-linear reward functions, and uncovers interesting links to doubly-robust estimation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes