Bandits on graphs and structures
For researchers in online learning and bandit theory, this thesis provides a survey of contributions on graph and structured bandits, but it is largely a compilation of existing work with no single breakthrough result.
This thesis investigates structural properties of sequential decision problems, focusing on graph-based bandits (spectral bandits, side observations, influence maximization) and large structured action spaces (kernel bandits, polymatroid bandits, function optimization, infinitely many-arms bandits), aiming to bring solutions closer to practical use.
The goal of this thesis is to investigate the structural properties of certain sequential problems in order to bring the solutions closer to a practical use. In the first part, we put a special emphasis on structures that can be represented as graphs on actions. In the second part, we study the large action spaces that can be of exponential size in the number of base actions or even infinite. For graph bandits, we consider the settings of smoothness of rewards (spectral bandits), side observations, and influence maximization. For large structured domains, we cover kernel bandits, polymatroid bandits, bandits for function optimization (including unknown smoothness), and infinitely many-arms bandits. The thesis aspires to be a survey of the author's contributions on graph and structured bandits.