ML LGMay 25, 2021

Bias-Robust Bayesian Optimization via Dueling Bandits

arXiv:2105.11802v211.012 citations

Originality Incremental advance

AI Analysis

This addresses robustness issues in optimization for applications like clinical trials or A/B testing where biases can skew results, though it builds incrementally on existing bandit methods.

The paper tackles Bayesian optimization with adversarially biased observations, such as from hidden confounders, by reducing it to a dueling bandit model and proposing a novel information-directed sampling approach, resulting in the first efficient kernelized algorithm with cumulative regret guarantees.

We consider Bayesian optimization in settings where observations can be adversarially biased, for example by an uncontrolled hidden confounder. Our first contribution is a reduction of the confounded setting to the dueling bandit model. Then we propose a novel approach for dueling bandits based on information-directed sampling (IDS). Thereby, we obtain the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees. Our analysis further generalizes a previously proposed semi-parametric linear bandit model to non-linear reward functions, and uncovers interesting links to doubly-robust estimation.

View on arXiv PDF

Similar