LG AI ML Jun 15, 2024

Graph Neural Thompson Sampling

arXiv:2406.10686v24.61 citations

Originality: Incremental advance

AI Analysis

This work addresses the graph action bandit problem, which arises in applications with graph-structured data. It is an incremental advance that combines GNNs with Thompson Sampling.

The paper tackles online decision-making with graph-structured data by proposing GNN-TS, a Graph Neural Network-powered Thompson Sampling algorithm. Its regret bound is sub-linear, of order $\tilde{\mathcal{O}}((\tilde{d} T)^{1/2})$, and independent of the number of graph nodes; the authors describe this bound as state of the art.

We consider an online decision-making problem with a reward function defined over graph-structured data. We formally formulate the problem as an instance of graph action bandit. We then propose \texttt{GNN-TS}, a Graph Neural Network (GNN) powered Thompson Sampling (TS) algorithm which employs a GNN approximator for estimating the mean reward function and the graph neural tangent features for uncertainty estimation. We prove that, under certain boundedness assumptions on the reward function, GNN-TS achieves a state-of-the-art regret bound which is (1) sub-linear of order $\tilde{\mathcal{O}}((\tilde{d} T)^{1/2})$ in the number of interaction rounds, $T$, and a notion of effective dimension $\tilde{d}$, and (2) independent of the number of graph nodes. Empirical results validate that our proposed \texttt{GNN-TS} exhibits competitive performance and scales well on graph action bandit problems.
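The loop described in the abstract (a GNN estimates the mean reward, and gradient-based "tangent" features drive the uncertainty estimate) might look roughly like the NumPy sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes each candidate action is a small graph given as an adjacency matrix and node features, it uses a one-layer mean-pooled GNN with a hand-derived gradient, and the names `normalize_adj`, `forward_and_grad`, `run_gnn_ts`, and `reward_fn`, along with the hyperparameters `lam`, `nu`, and `lr`, are all hypothetical.

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def forward_and_grad(S, W1, w2):
    """One-layer GNN f = mean_i relu(S W1)_i . w2 / sqrt(m).

    S is the pre-aggregated feature matrix (normalized adjacency @ X).
    Returns f and the flattened gradient of f w.r.t. (W1, w2), which
    plays the role of the tangent feature vector in this sketch.
    """
    n, m = S.shape[0], W1.shape[1]
    z = S @ W1                         # (n, m) pre-activations
    pooled = np.maximum(z, 0.0).mean(axis=0)
    f = pooled @ w2 / np.sqrt(m)
    g_w2 = pooled / np.sqrt(m)
    g_W1 = S.T @ ((z > 0) * w2) / (n * np.sqrt(m))
    return f, np.concatenate([g_W1.ravel(), g_w2])

def run_gnn_ts(graphs, reward_fn, T, m=16, lam=1.0, nu=0.1, lr=0.01, seed=0):
    """Thompson Sampling over graph actions (illustrative assumptions only).

    graphs: list of (A, X) pairs; reward_fn(action_index, rng) -> float.
    """
    rng = np.random.default_rng(seed)
    S_list = [normalize_adj(A) @ X for A, X in graphs]
    d = S_list[0].shape[1]
    W1, w2 = rng.normal(size=(d, m)), rng.normal(size=m)
    U = lam * np.eye(d * m + m)        # regularized design matrix
    actions, rewards = [], []
    for _ in range(T):
        samples, grads = [], []
        for S in S_list:
            f, g = forward_and_grad(S, W1, w2)
            var = max(g @ np.linalg.solve(U, g), 0.0)
            samples.append(rng.normal(f, nu * np.sqrt(var)))
            grads.append(g)
        a = int(np.argmax(samples))    # act greedily on the sampled rewards
        r = float(reward_fn(a, rng))
        U += np.outer(grads[a], grads[a])
        actions.append(a)
        rewards.append(r)
        # A few full-batch gradient steps on the squared loss over the history.
        for _ in range(5):
            total = np.zeros(d * m + m)
            for a_i, r_i in zip(actions, rewards):
                f_i, g_i = forward_and_grad(S_list[a_i], W1, w2)
                total += (f_i - r_i) * g_i
            total /= len(actions)
            W1 = W1 - lr * total[: d * m].reshape(d, m)
            w2 = w2 - lr * total[d * m:]
    return actions, rewards

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    graphs = []
    for _ in range(5):
        A = np.triu((rng.random((6, 6)) < 0.4).astype(float), 1)
        graphs.append((A + A.T, rng.normal(size=(6, 3))))
    # Hypothetical reward: mean of the first node feature plus noise.
    noisy = lambda a, r: graphs[a][1][:, 0].mean() + 0.1 * r.normal()
    chosen, _ = run_gnn_ts(graphs, noisy, T=30)
    print(np.bincount(chosen, minlength=len(graphs)))
```

The posterior variance in this sketch is computed in parameter space, so the per-round cost depends on the number of network parameters rather than on the number of nodes in each graph. This only loosely mirrors the node-independence stated for the regret bound, and says nothing about the paper's actual proof or architecture.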
