Stabilized Nested Rollout Policy Adaptation
This work provides an incremental improvement to an existing Monte Carlo search algorithm, benefiting researchers and practitioners working on single-player games and optimization problems.
This paper proposes a modification to the Nested Rollout Policy Adaptation (NRPA) algorithm to enhance its stability. The improved algorithm demonstrates better performance across various application domains, including SameGame, Traveling Salesman with Time Windows, and Expression Discovery.
Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single-player games. In this paper, we propose a modification to NRPA that improves the stability of the algorithm. Experiments show that the modification improves results in several application domains: SameGame, Traveling Salesman with Time Windows, and Expression Discovery.
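For readers unfamiliar with the baseline, the following is a minimal sketch of standard NRPA as it is commonly described, not the stabilized variant proposed here, whose details are not given in this section. The toy problem (guessing a hidden digit sequence), the names `TARGET`, `score`, `code`, `playout`, `adapt`, and `nrpa`, and the parameter values are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy problem (an assumption, not from the paper): pick N_STEPS
# moves from 0..N_MOVES-1; the score counts positions matching TARGET.
N_STEPS = 5
N_MOVES = 3
TARGET = [2, 0, 1, 1, 2]


def score(seq):
    return sum(1 for a, b in zip(seq, TARGET) if a == b)


def code(step, move):
    # Key identifying a (state, move) pair in the policy table.
    return (step, move)


def playout(policy, rng):
    # Level 0: sample one full sequence, choosing each move with
    # probability proportional to exp(policy weight).
    seq = []
    for step in range(N_STEPS):
        weights = [math.exp(policy.get(code(step, m), 0.0)) for m in range(N_MOVES)]
        seq.append(rng.choices(range(N_MOVES), weights=weights)[0])
    return score(seq), seq


def adapt(policy, seq, alpha):
    # Move the policy toward the best sequence: raise the weight of each
    # chosen move and lower all moves in proportion to their probability.
    new_policy = dict(policy)
    for step, best_move in enumerate(seq):
        weights = [math.exp(policy.get(code(step, m), 0.0)) for m in range(N_MOVES)]
        total = sum(weights)
        for m in range(N_MOVES):
            key = code(step, m)
            new_policy[key] = new_policy.get(key, 0.0) - alpha * weights[m] / total
        new_policy[code(step, best_move)] += alpha
    return new_policy


def nrpa(level, policy, rng, iterations=20, alpha=1.0):
    if level == 0:
        return playout(policy, rng)
    best_score, best_seq = -1, []
    policy = dict(policy)  # each level adapts its own copy of the policy
    for _ in range(iterations):
        s, seq = nrpa(level - 1, policy, rng, iterations, alpha)
        if s >= best_score:
            best_score, best_seq = s, seq
        policy = adapt(policy, best_seq, alpha)
    return best_score, best_seq


if __name__ == "__main__":
    best, sequence = nrpa(2, {}, random.Random(0))
    print(best, sequence)
```

In this sketch, each level keeps the best sequence returned by the level below and adapts its policy toward it after every recursive call.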