AI · Feb 7, 2023

Towards Understanding the Effects of Evolving the MCTS UCT Selection Policy

arXiv:2302.03352v1 · 3 citations · h-index: 20
Originality: Incremental advance
AI Analysis

This work offers researchers and practitioners in AI and optimization guidance on when an evolved selection policy is worth using in MCTS. It is an incremental contribution: it builds on existing methods rather than introducing a new paradigm.

The paper asks when evolving the MCTS UCT selection policy improves performance. Using five functions that range from unimodal through multimodal to deceptive, it shows that evolved variants can help in multimodal and deceptive scenarios, while standard UCT remains robust in unimodal ones and competitive elsewhere.

Monte Carlo Tree Search (MCTS) is a sampling-based best-first method for searching for optimal decisions. Its success depends heavily on how the MCTS statistical tree is built, and the selection policy plays a fundamental role in this. One selection policy that works particularly well and is widely adopted in MCTS is Upper Confidence Bounds for Trees (UCT). The community has proposed other, more sophisticated bounds with the goal of improving MCTS performance on particular problems. Thus, while MCTS UCT generally behaves well, some variants might behave better, and multiple works have accordingly proposed evolving a selection policy for use in MCTS. Although these works are inspiring, each focuses on a single type of problem, so none has carried out an in-depth analysis of the circumstances under which an evolved alternative to MCTS UCT might be beneficial. In contrast, in this work we use five functions of different nature, ranging from a unimodal function through multimodal functions to deceptive functions. We demonstrate that evolving the MCTS UCT might be beneficial in multimodal and deceptive scenarios, whereas the MCTS UCT is robust in unimodal scenarios and competitive in the rest of the scenarios used in this study.
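
For readers unfamiliar with the selection step, the following is a minimal sketch of a commonly used UCT score (mean reward plus an exploration bonus), not the paper's evolved policies or its exact formulation. The function names `uct_score` and `select_child`, the `(value_sum, visits)` tuple layout, and the default exploration constant `c = sqrt(2)` are assumptions made for illustration.

```python
import math


def uct_score(value_sum, visits, parent_visits, c=math.sqrt(2)):
    """Common UCT form: average reward plus an exploration bonus.

    Assumed convention: unvisited children score infinity so that
    each child is tried at least once.
    """
    if visits == 0:
        return math.inf
    exploitation = value_sum / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration


def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the index of the child with the highest UCT score.

    `children` is a list of (value_sum, visits) tuples (hypothetical layout).
    """
    return max(
        range(len(children)),
        key=lambda i: uct_score(children[i][0], children[i][1], parent_visits, c),
    )


if __name__ == "__main__":
    children = [(9.0, 10), (1.0, 2)]
    # With c = 0 the choice is purely greedy on the mean reward.
    print(select_child(children, parent_visits=12, c=0.0))
    # With the default c the rarely visited child gets a larger bonus.
    print(select_child(children, parent_visits=12))
```

An evolved selection policy, in the sense discussed in the abstract, would replace the expression inside `uct_score` with a different formula over the same statistics; the sketch only fixes the baseline being compared against.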

