MA · AI · Mar 2

Boltzmann-based Exploration for Robust Decentralized Multi-Agent Planning

arXiv:2603.02154v1 · h-index: 1
Originality: Incremental advance
AI Analysis

CB-MCTS offers a robust approach to cooperative multi-agent planning in deceptive environments. The advance appears incremental, since it adapts Boltzmann exploration from single-agent to multi-agent contexts.

The paper addresses decentralized multi-agent planning in environments with sparse or skewed rewards. It introduces Coordinated Boltzmann MCTS (CB-MCTS), which replaces deterministic UCT with a stochastic Boltzmann policy and a decaying entropy bonus. In simulations, CB-MCTS outperforms Dec-MCTS in deceptive scenarios while remaining competitive on standard benchmarks.

Decentralized Monte Carlo Tree Search (Dec-MCTS) is widely used for cooperative multi-agent planning but struggles in sparse or skewed reward environments. We introduce Coordinated Boltzmann MCTS (CB-MCTS), which replaces deterministic UCT with a stochastic Boltzmann policy and a decaying entropy bonus for sustained yet focused exploration. While Boltzmann exploration has been studied in single-agent MCTS, applying it in multi-agent systems poses unique challenges. CB-MCTS is the first to address this. We analyze CB-MCTS in the simple-regret setting and show in simulations that it outperforms Dec-MCTS in deceptive scenarios and remains competitive on standard benchmarks, providing a robust solution for multi-agent planning.
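
To make the contrast between deterministic UCT and a stochastic Boltzmann policy concrete, here is a minimal single-node sketch in Python. It is not the CB-MCTS algorithm: the abstract does not give the form of the entropy bonus or the coordination mechanism, so the logarithmic temperature decay below is only a stand-in for "sustained yet focused exploration". All function names, the schedule, and the constants (`tau0`, `tau_min`, `c`) are assumptions made for illustration.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Softmax over value estimates (max subtracted for numerical stability)."""
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def decayed_temperature(visits, tau0=1.0, tau_min=0.05):
    """Hypothetical schedule: the temperature shrinks as the node is visited more.

    This stands in for the paper's decaying entropy bonus, whose exact form
    is not stated in the abstract.
    """
    return max(tau_min, tau0 / math.log(visits + math.e))


def boltzmann_select(q_values, visits, rng=random):
    """Stochastic choice: every action keeps a nonzero selection probability."""
    probs = boltzmann_probs(q_values, decayed_temperature(visits))
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def uct_select(q_values, counts, c=1.4):
    """Deterministic UCT: argmax of value estimate plus a count-based bonus.

    Assumes every action has been tried at least once (all counts > 0).
    """
    total = sum(counts)
    scores = [q + c * math.sqrt(math.log(total) / n)
              for q, n in zip(q_values, counts)]
    return max(range(len(scores)), key=scores.__getitem__)
```

With the same estimates and counts, `uct_select` always returns the same action, whereas `boltzmann_select` samples from a distribution that sharpens toward the highest-valued action as `visits` grows. That difference is the single-agent intuition behind swapping UCT for a Boltzmann policy; how CB-MCTS coordinates such policies across agents is not covered by this sketch.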

