LGOct 9, 2025

Design-Based Bandits Under Network Interference: Trade-Off Between Regret and Statistical Inference

Peking U
arXiv:2510.07646v21 citationsh-index: 18
Originality Incremental advance
AI Analysis

This addresses the trade-off between exploration and statistical inference in networked decision-making, which is incremental as it extends prior single-unit work to network settings.

The paper tackles the problem of balancing regret minimization and inference accuracy in multi-armed bandits with network interference, establishing a theoretical Pareto frontier and introducing an algorithm with anytime-valid confidence sequences.

In multi-armed bandits with network interference (MABNI), the action taken by one node can influence the rewards of others, creating complex interdependence. While existing research on MABNI largely concentrates on minimizing regret, it often overlooks the crucial concern that an excessive emphasis on the optimal arm can undermine the inference accuracy for sub-optimal arms. Although initial efforts have been made to address this trade-off in single-unit scenarios, these challenges have become more pronounced in the context of MABNI. In this paper, we establish, for the first time, a theoretical Pareto frontier characterizing the trade-off between regret minimization and inference accuracy in adversarial (design-based) MABNI. We further introduce an anytime-valid asymptotic confidence sequence along with a corresponding algorithm, $\texttt{EXP3-N-CS}$, specifically designed to balance the trade-off between regret minimization and inference accuracy in this setting.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes