ML LG | Apr 17, 2019

X-Armed Bandits: Optimizing Quantiles, CVaR and Other Risks

Léonard Torossian, Aurélien Garivier, Victor Picheny

arXiv:1904.08205v38.318 citations

Originality: Incremental advance

AI Analysis

The paper addresses risk-averse decision-making in fields like finance and medicine. Its contribution is incremental, since it builds on existing bandit and black-box optimization methods.

The paper tackles risk-averse optimization of stochastic black-box functions, with applications such as agriculture or finance. It proposes the StoROO algorithm to optimize quantiles and CVaR, and its numerical experiments show that tight confidence bounds dramatically improve performance.

We propose and analyze StoROO, an algorithm derived from StoOO for risk optimization on stochastic black-box functions. Motivated by risk-averse decision-making fields such as agriculture, medicine, biology, or finance, we do not focus on the mean payoff but on generic functionals of the return distribution. We provide a generic regret analysis of StoROO and illustrate its applicability with two examples: the optimization of quantiles and CVaR. Inspired by the bandit literature and black-box mean optimizers, StoROO relies on the ability to construct confidence intervals for the targeted functional based on random-size samples. We detail their construction in the case of quantiles, providing tight bounds based on Kullback-Leibler divergence. We finally present numerical experiments that show a dramatic impact of tight bounds for the optimization of quantiles and CVaR.
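To make the two risk functionals and the idea of a KL-based quantile confidence interval concrete, here is a minimal Python sketch. It is not the paper's construction: it assumes a fixed sample size (the abstract targets random-size samples), a continuous return distribution, a lower-tail CVaR convention (mean of the worst `tau` fraction of returns), and a simple two-sided Chernoff bound with threshold `log(2/delta)/n`. All function names are hypothetical and chosen for illustration.

```python
import math


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def empirical_quantile(samples, tau):
    """Empirical tau-quantile: the ceil(tau * n)-th smallest sample."""
    xs = sorted(samples)
    k = max(math.ceil(tau * len(xs)), 1)
    return xs[k - 1]


def empirical_cvar(samples, tau):
    """Lower-tail CVaR (assumed convention): mean of the ceil(tau * n) smallest samples."""
    xs = sorted(samples)
    k = max(math.ceil(tau * len(xs)), 1)
    return sum(xs[:k]) / k


def quantile_confidence_interval(samples, tau, delta):
    """Order-statistic interval for the tau-quantile (illustrative sketch).

    The count of samples below the true quantile is Binomial(n, tau), so with
    probability at least 1 - delta (Chernoff bound, fixed n) its fraction k/n
    satisfies kl(k/n, tau) <= log(2/delta)/n. The plausible counts k translate
    into order statistics that bracket the quantile.
    """
    xs = sorted(samples)
    n = len(xs)
    threshold = math.log(2 / delta) / n
    ks = [k for k in range(n + 1) if bernoulli_kl(k / n, tau) <= threshold]
    if not ks:  # defensive fallback for very small n or extreme tau
        ks = [round(tau * n)]
    lo_k, hi_k = min(ks), max(ks)
    lower = xs[lo_k - 1] if lo_k >= 1 else -math.inf
    upper = xs[hi_k] if hi_k < n else math.inf
    return lower, upper
```

For example, calling `quantile_confidence_interval(list(range(1, 11)), 0.5, 0.1)` brackets the empirical median of the ten values. An optimistic optimizer in the spirit of StoOO would use the upper end of such an interval as an optimistic score for a region, so tighter intervals mean fewer wasted evaluations.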

