Generalised Entropy MDPs and Minimax Regret
The paper addresses the challenge of prior specification in Bayesian decision-making, but the contribution appears incremental because it builds on existing bandit theory.
The paper tackles the problem of specifying prior beliefs in Bayesian methods by considering worst-case priors; finding such a prior requires solving a stochastic zero-sum game. It extends well-known results from bandit theory to find minimax-Bayes policies and discusses when they are practical.
Bayesian methods suffer from the problem of how to specify prior beliefs. One interesting idea is to consider worst-case priors. This requires solving a stochastic zero-sum game. In this paper, we extend well-known results from bandit theory in order to discover minimax-Bayes policies and discuss when they are practical.
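To make the worst-case-prior idea concrete, here is a minimal sketch in a one-shot, finite setting, not the sequential stochastic game the paper studies. It assumes two candidate models and three policies with made-up expected utilities (the table `U` and the function names are hypothetical, chosen only for illustration). Nature picks the prior that minimises the best expected utility the agent can achieve against it.

```python
# Hypothetical expected utilities U[policy][model]; the numbers are
# illustrative assumptions, not results from the paper.
U = [
    [1.0, 0.0],  # policy 0: good under model 0, bad under model 1
    [0.0, 1.0],  # policy 1: the reverse
    [0.4, 0.4],  # policy 2: a "safe" policy, equally mediocre under both
]


def bayes_value(b):
    """Best expected utility against the prior (b, 1 - b) over the two models."""
    return max(b * u0 + (1.0 - b) * u1 for u0, u1 in U)


def worst_case_prior(grid_size=1001):
    """Grid-search the prior weight b that minimises the Bayes-optimal value."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    b_star = min(grid, key=bayes_value)
    return b_star, bayes_value(b_star)


b_star, value = worst_case_prior()
print(f"worst-case prior weight on model 0: {b_star}, value: {value}")
```

In this toy table the Bayes-optimal value is the maximum of `b`, `1 - b`, and `0.4`, so the grid search lands on the uniform prior. A grid search is only workable here because the prior is one-dimensional; it is a stand-in for, not an instance of, the methods the paper develops.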