LGAIJun 17, 2025

Fair Algorithms with Probing for Multi-Agent Multi-Armed Bandits

arXiv:2506.14988v31 citationsh-index: 8
Originality Incremental advance
AI Analysis

It addresses fairness and efficiency in multi-agent decision-making under limited information, which is an incremental improvement in bandit algorithms.

The paper tackles the problem of ensuring fair outcomes across agents while maximizing performance in multi-agent multi-armed bandits by introducing a novel probing framework to gather information before allocation. In experiments, it outperforms baselines with better fairness and efficiency.

We propose a multi-agent multi-armed bandit (MA-MAB) framework aimed at ensuring fair outcomes across agents while maximizing overall system performance. A key challenge in this setting is decision-making under limited information about arm rewards. To address this, we introduce a novel probing framework that strategically gathers information about selected arms before allocation. In the offline setting, where reward distributions are known, we leverage submodular properties to design a greedy probing algorithm with a provable performance bound. For the more complex online setting, we develop an algorithm that achieves sublinear regret while maintaining fairness. Extensive experiments on synthetic and real-world datasets show that our approach outperforms baseline methods, achieving better fairness and efficiency.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes