AI, Feb 16, 2018

Monte Carlo Q-learning for General Game Playing

arXiv:1802.05944v213.220 citations

Originality: Incremental advance

AI Analysis

This is an incremental improvement for researchers in reinforcement learning and game playing, focusing on GGP as a testbed.

The paper tackles the problem of applying reinforcement learning to General Game Playing (GGP) by implementing Q-learning on small-board games and enhancing it with Monte Carlo Search to create QM-learning, which improves performance over pure Q-learning.

After the recent groundbreaking results of AlphaGo, we have seen a strong interest in reinforcement learning for game playing. General Game Playing (GGP) provides a good testbed for reinforcement learning: a GGP agent is given only a specification of a game's rules and must learn to play from that, which makes GGP problems a natural fit for reinforcement learning. Q-learning is one of the canonical reinforcement learning methods, and has been applied to GGP by Banerjee & Stone (IJCAI 2007). In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex), to allow comparison to Banerjee et al. As expected, Q-learning converges, although much more slowly than MCTS. Borrowing an idea from MCTS, we enhance Q-learning with Monte Carlo Search, to give QM-learning. This enhancement improves the performance of pure Q-learning. We believe that QM-learning can also be used to further improve the performance of reinforcement learning on larger games, something which we will test in future work.
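To make the idea concrete, here is a minimal Python sketch of tabular Q-learning on Tic-Tac-Toe in which the exploration step can optionally be replaced by a small Monte Carlo search. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not spell out how the search is combined with Q-learning, so using rollouts only during exploration is an assumption, as are the random opponent, the reward values, the hyperparameters, and every function name below.

```python
import random
from collections import defaultdict

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' for a completed line, 'draw' for a full board, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return 'draw' if ' ' not in board else None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == ' ']

def play(board, move, player):
    return board[:move] + player + board[move + 1:]

def other(player):
    return 'O' if player == 'X' else 'X'

def rollout(board, to_move, rng):
    """Play uniformly random moves until the game ends; return the result."""
    result = winner(board)
    while result is None:
        board = play(board, rng.choice(legal_moves(board)), to_move)
        to_move = other(to_move)
        result = winner(board)
    return result

def monte_carlo_move(board, player, rng, n_rollouts=10):
    """Pick the legal move with the best total score over random rollouts."""
    def score(move):
        nxt = play(board, move, player)
        total = 0
        for _ in range(n_rollouts):
            r = rollout(nxt, other(player), rng)
            total += 1 if r == player else (-1 if r == other(player) else 0)
        return total
    return max(legal_moves(board), key=score)

def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.3, use_mc=True, seed=0):
    """Train player X against a random opponent O (an assumed setup).

    With use_mc=False this is plain epsilon-greedy Q-learning; with
    use_mc=True the exploratory move comes from monte_carlo_move instead
    of a uniformly random choice (assumed reading of QM-learning).
    """
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (board, move) -> estimated value for X
    for _ in range(episodes):
        board = ' ' * 9
        while winner(board) is None:
            moves = legal_moves(board)
            if rng.random() < epsilon:
                move = monte_carlo_move(board, 'X', rng) if use_mc else rng.choice(moves)
            else:
                move = max(moves, key=lambda m: q[(board, m)])
            nxt = play(board, move, 'X')
            result = winner(nxt)
            if result is None:  # the random opponent replies
                nxt = play(nxt, rng.choice(legal_moves(nxt)), 'O')
                result = winner(nxt)
            reward = {'X': 1.0, 'O': -1.0}.get(result, 0.0)
            future = 0.0 if result is not None else max(
                q[(nxt, m)] for m in legal_moves(nxt))
            # Standard Q-learning update toward reward + discounted best next value.
            q[(board, move)] += alpha * (reward + gamma * future - q[(board, move)])
            board = nxt
    return q
```

Calling `train(use_mc=False)` and `train(use_mc=True)` with the same seed gives two Q-tables that can be evaluated against the same opponent to compare the two variants. The sketch makes no claim about which one performs better in this toy setup.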
