LGMar 10, 2025

Graph-Dependent Regret Bounds in Multi-Armed Bandits with Interference

arXiv:2503.07555v24 citationsh-index: 5
Originality Highly original
AI Analysis

This addresses the computational challenge of exponentially large action spaces in bandit problems with interference, which is important for applications like social networks or marketing, though it builds incrementally on prior graph-based bandit work.

The paper tackles the problem of multi-armed bandits with network interference, where rewards depend on neighbors' treatments in a graph, by proposing a novel algorithm that uses local graph structure to minimize regret. It provides graph-dependent upper bounds that improve over prior work, shows near-optimality with matching lower bounds up to logarithmic factors, and demonstrates outperformance of baseline methods in experiments.

We study multi-armed bandits under network interference, where each unit's reward depends on its own treatment and those of its neighbors in a given graph. This induces an exponentially large action space, making standard approaches computationally impractical. We propose a novel algorithm that uses the local graph structure to minimize regret. We derive a graph-dependent upper bound on cumulative regret that improves over prior work. Additionally, we provide the first lower bounds for bandits with arbitrary network interference, where each bound involves a distinct structural property of the graph. These bounds show that for both dense and sparse graphs, our algorithm is nearly optimal, with matching upper and lower bounds up to logarithmic factors. When the interference graph is unknown, a variant of our algorithm is Pareto optimal: no algorithm can uniformly outperform it across all instances. We complement our theoretical results with numerical experiments, showing that our approach outperforms the baseline methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes