LG · ML · Oct 1, 2018

Risk-Averse Stochastic Convex Bandit

arXiv:1810.00737v1 · 36 citations
Originality: Highly original
AI Analysis

This paper addresses risk aversion in online convex bandit problems, which matters for applications such as clinical trials and finance. The authors present it as the first attempt in this area.

The authors study online convex optimization with bandit feedback for a risk-averse decision maker, motivated by applications in clinical trials and finance. They propose two algorithms: a descent-type algorithm that is easy to implement, and a second algorithm that achieves (almost) optimal regret bounds with respect to the number of rounds.

Motivated by applications in clinical trials and finance, we study the problem of online convex optimization (with bandit feedback) where the decision maker is risk-averse. We provide two algorithms to solve this problem. The first is a descent-type algorithm that is easy to implement. The second, which combines the ellipsoid method and a center point device, achieves (almost) optimal regret bounds with respect to the number of rounds. To the best of our knowledge, this is the first attempt to address risk aversion in the online convex bandit problem.
