LG, May 1

NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search

arXiv:2605.0075151.12 citations
AI Analysis

This work addresses the scalability problem of MCTS in cooperative multi-agent domains, offering a practical solution for agents to coordinate under limited search budgets.

NonZero enables tractable multi-agent MCTS by using an interaction-guided proposal rule to explore joint actions via low-dimensional deviations, achieving sublinear local regret and outperforming baselines on MatGame, SMAC, and SMACv2 under matched search budgets.

Monte Carlo Tree Search (MCTS) scales poorly in cooperative multi-agent domains because expansion must consider an exponentially large set of joint actions, severely limiting exploration under realistic search budgets. We propose NonZero, which keeps multi-agent MCTS tractable by running surrogate-guided selection over a low-dimensional nonlinear representation using an interaction-guided proposal rule, instead of directly exploring the full joint-action space. Our exploration uses an interaction score: single-agent deviations are ranked by predicted gain, while two-agent deviations are scored by a mixed-difference measure that reveals coordination benefits even when no single agent can improve alone. We formalize candidate proposal as a bandit problem over local deviations and derive a proposal rule, NonZero, with a sublinear local-regret guarantee for reaching approximate graph-local optima without enumerating the joint-action space. Empirically, NonZero improves sample efficiency and final performance on MatGame, SMAC, and SMACv2 relative to strong model-based and model-free baselines under matched search budgets.
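The interaction score described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes the "mixed-difference measure" is the standard second-order mixed difference of a surrogate value function, and the names `single_gain`, `mixed_difference`, `q`, and `PAYOFF` are hypothetical. The toy payoff matrix is chosen so that no single agent can improve alone, yet a coordinated two-agent deviation does.

```python
# Illustrative sketch only; the exact scoring used by NonZero may differ.
# Assumption: q maps a joint action (tuple of per-agent actions) to a value.

def single_gain(q, base, i, b):
    """Predicted gain when only agent i deviates from base to action b."""
    dev = list(base)
    dev[i] = b
    return q(tuple(dev)) - q(tuple(base))


def mixed_difference(q, base, i, bi, j, bj):
    """Second-order mixed difference for agents i and j deviating together.

    Q(both deviate) - Q(only i) - Q(only j) + Q(base).
    A large positive value signals a coordination benefit that the two
    single-agent gains alone would not reveal.
    """
    both = list(base)
    both[i], both[j] = bi, bj
    only_i = list(base)
    only_i[i] = bi
    only_j = list(base)
    only_j[j] = bj
    return (q(tuple(both)) - q(tuple(only_i))
            - q(tuple(only_j)) + q(tuple(base)))


# Hypothetical 2-agent coordination game: (0, 0) pays 1, (1, 1) pays 2,
# and any mismatch pays 0.
PAYOFF = [[1.0, 0.0],
          [0.0, 2.0]]


def q(joint_action):
    return PAYOFF[joint_action[0]][joint_action[1]]


base = (0, 0)
gain_0 = single_gain(q, base, 0, 1)            # 0 - 1 = -1
gain_1 = single_gain(q, base, 1, 1)            # 0 - 1 = -1
mixed = mixed_difference(q, base, 0, 1, 1, 1)  # 2 - 0 - 0 + 1 = 3
```

Here both single-agent gains are negative, so a ranking based only on individual deviations would stay at `(0, 0)`, while the positive mixed difference flags the pair deviation to `(1, 1)` as worth proposing.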

