PMCTS: Particle Monte Carlo Tree Search for Principled Parallelized Inference Time Scaling
This work addresses the challenge of parallelizing MCTS for neural network evaluations, offering a theoretically grounded solution for practitioners needing efficient inference-time scaling.
PMCTS introduces a principled parallel MCTS algorithm that scales with parallel compute and preserves formal policy improvement guarantees, outperforming heuristic baselines across domains.
Monte Carlo Tree Search (MCTS) is a widely used approach to policy improvement through search, and it is increasingly popular in real-world applications. Because its search is sequential and deterministic, scaling MCTS at runtime with parallel compute remains a major challenge. We introduce Particle MCTS (PMCTS), to our knowledge the first principled parallel MCTS algorithm that is suited to neural network evaluations and can preserve formal policy improvement guarantees. Empirically, PMCTS scales well with parallel compute and significantly outperforms popular heuristic-based baselines across domains.
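To illustrate why a sequential, deterministic search is hard to parallelize, here is a minimal sketch of a deterministic selection step. It assumes the classic UCB1 rule from UCT, which the abstract does not name, and it is not the PMCTS algorithm. The function name `ucb1_select`, the exploration constant `c`, and the example statistics are all hypothetical.

```python
import math

def ucb1_select(visits, values, c=1.4):
    """Return the index of the child with the highest UCB1 score.

    Assumption: classic UCB1 selection as used in UCT. The selection rule
    used by PMCTS or its baselines is not specified in the abstract.
    """
    total = sum(visits)
    best, best_score = 0, -math.inf
    for i, (n, w) in enumerate(zip(visits, values)):
        if n == 0:
            return i  # unvisited children are tried first
        score = w / n + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = i, score
    return best

# Hypothetical statistics for three children of one node.
visits = [10, 5, 1]
values = [6.0, 2.0, 0.0]

# Four "parallel workers" that read the same statistics make the same choice,
# so without further changes they would all evaluate the same path.
picks = [ucb1_select(visits, values) for _ in range(4)]
print(picks)
```

Because the rule depends only on the current visit counts and value sums, workers diverge only after one of them has written its result back into the tree, which is what ties each selection to the result of the previous one.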