Exploring Adaptive MCTS with TD Learning in miniXCOM
This work addresses the need for faster adaptation in game AI for turn-based tactical games. The contribution appears incremental, since it combines two established methods, MCTS and TD learning.
The paper addresses the substantial training period required when Monte Carlo tree search (MCTS) is combined with deep neural networks. It introduces MCTS-TD, an adaptive algorithm that integrates temporal difference learning into MCTS without pre-training, and reports improved performance against opponents in the game miniXCOM.
In recent years, Monte Carlo tree search (MCTS) has achieved widespread adoption within the game community. Its use in conjunction with deep reinforcement learning has produced success stories in many applications. While these approaches have been implemented in various games, from simple board games to more complicated video games such as StarCraft, the use of deep neural networks requires a substantial training period. In this work, we explore online adaptivity in MCTS without requiring pre-training. We present MCTS-TD, an adaptive MCTS algorithm improved with temporal difference learning. We demonstrate our new approach on the game miniXCOM, a simplified version of XCOM, a popular commercial franchise consisting of several turn-based tactical games, and show how adaptivity in MCTS-TD allows for improved performance against opponents.
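
The abstract does not spell out how temporal difference learning enters the search, so the following is only a minimal sketch of one common way to combine the two ideas, not the MCTS-TD update itself. It assumes a hypothetical `Node` structure and a hypothetical `td_backup` function that replaces the usual averaging of simulation returns with a one-step TD update along the visited path. The step size `alpha`, the discount `gamma`, and the single-player value perspective (no sign flip between plies) are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """A search-tree node holding a learned value estimate (hypothetical structure)."""
    value: float = 0.0  # current value estimate V(s)
    visits: int = 0     # number of times the node was on a backed-up path


def td_backup(path: List[Node], leaf_return: float,
              alpha: float = 0.5, gamma: float = 1.0) -> None:
    """Update values along a root-to-leaf path with a one-step TD rule.

    The leaf is moved toward the simulation return; each ancestor is then
    moved toward the discounted, freshly updated value of its successor.
    Assumption: values are stored from a single player's perspective, so no
    sign flip is applied between plies.
    """
    target = leaf_return
    for node in reversed(path):
        node.visits += 1
        # TD update: V(s) <- V(s) + alpha * (target - V(s))
        node.value += alpha * (target - node.value)
        # Bootstrap: the parent's target is the child's updated estimate.
        target = gamma * node.value


if __name__ == "__main__":
    path = [Node(), Node(), Node()]  # root, inner node, leaf
    td_backup(path, leaf_return=1.0)
    print([n.value for n in path])
```

Because each node bootstraps from its successor's estimate rather than from the raw simulation return, values learned in earlier searches can shape later ones without a separate pre-training phase, which is the kind of online adaptivity the abstract describes. A two-player version would typically negate the target at each ply.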