LG ML Jun 1, 2022

An $α$-No-Regret Algorithm For Graphical Bilinear Bandits

Geovani Rizk, Igor Colin, Albert Thomas, Rida Laraki, Yann Chevaleyre

arXiv:2206.00466v23.31 citations h-index: 31


AI Analysis

This paper addresses a combinatorial NP-hard problem that arises in a multi-agent bandit setting and provides a novel algorithm with regret guarantees for graphical bilinear interactions.

The authors tackle the Graphical Bilinear Bandits problem, in which agents on a graph play a stochastic bilinear bandit game with their neighbors. They propose the first regret-based algorithm for this setting, built on the principle of optimism in the face of uncertainty, establish an upper bound of $\tilde{O}(\sqrt{T})$ on the $\alpha$-regret, and support the approach with experiments.

We propose the first regret-based approach to the Graphical Bilinear Bandits problem, where $n$ agents in a graph play a stochastic bilinear bandit game with each of their neighbors. This setting reveals a combinatorial NP-hard problem that prevents the use of any existing regret-based algorithm in the (bi-)linear bandit literature. In this paper, we fill this gap and present the first regret-based algorithm for graphical bilinear bandits using the principle of optimism in the face of uncertainty. Theoretical analysis of this new method yields an upper bound of $\tilde{O}(\sqrt{T})$ on the $α$-regret and evidences the impact of the graph structure on the rate of convergence. Finally, we show through various experiments the validity of our approach.
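To make the combinatorial difficulty and the notion of $\alpha$-regret concrete, here is a minimal Python sketch. It is not the authors' algorithm: it only enumerates joint arms by brute force on a toy instance. It assumes that the expected reward on a directed edge $(i, j)$ is $x_i^\top M x_j$ and that the $\alpha$-regret is $\sum_t (\alpha \cdot \text{optimum} - \text{value at round } t)$. The function names, the matrix `M`, the arm set, and the path graph are all hypothetical choices made for illustration.

```python
import itertools

import numpy as np


def expected_global_reward(arms, M, edges, assignment):
    """Sum of bilinear rewards x_i^T M x_j over directed edges (i, j).

    Assumption: the edge reward has this bilinear form; `assignment[i]`
    is the index of the arm chosen by agent i.
    """
    return sum(arms[assignment[i]] @ M @ arms[assignment[j]] for i, j in edges)


def brute_force_best_assignment(arms, M, edges, n_agents):
    """Enumerate all K**n joint arms; only feasible for tiny instances."""
    best_value, best_assignment = -np.inf, None
    for assignment in itertools.product(range(len(arms)), repeat=n_agents):
        value = expected_global_reward(arms, M, edges, assignment)
        if value > best_value:
            best_value, best_assignment = value, assignment
    return best_assignment, best_value


def alpha_regret(alpha, optimal_value, played_values):
    """Cumulative alpha-regret: sum over rounds of (alpha * optimum - value)."""
    return sum(alpha * optimal_value - v for v in played_values)


# Hypothetical toy instance: 3 agents on a path graph, 2 arms in R^2.
arms = np.array([[1.0, 0.0], [0.0, 1.0]])
M = np.array([[0.0, 1.0], [1.0, 0.0]])  # pays 1 when neighbors pick different arms
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]  # each undirected edge in both directions

best_assignment, best_value = brute_force_best_assignment(arms, M, edges, n_agents=3)
print(best_assignment, best_value)
```

The enumeration visits $K^n$ joint arms for $K$ arms and $n$ agents, which is why exhaustive search stops being practical beyond very small graphs and why a guarantee relative to an $\alpha$-fraction of the optimum is a natural target.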
