AIAug 3, 2020

Learning to Play Two-Player Perfect-Information Games without Knowledge

arXiv:2008.01188v516 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of developing AI players for games like Hex without expert knowledge, though it appears incremental as it builds on existing reinforcement learning methods.

The paper tackled the problem of learning game state evaluation functions in two-player perfect-information games without prior knowledge, proposing several reinforcement learning techniques that improved play level and surpassed Mohex 3HNN in Hex on sizes 11 and 13.

In this paper, several techniques for learning game state evaluation functions by reinforcement are proposed. The first is a generalization of tree bootstrapping (tree learning): it is adapted to the context of reinforcement learning without knowledge based on non-linear functions. With this technique, no information is lost during the reinforcement learning process. The second is a modification of minimax with unbounded depth extending the best sequences of actions to the terminal states. This modified search is intended to be used during the learning process. The third is to replace the classic gain of a game (+1 / -1) with a reinforcement heuristic. We study particular reinforcement heuristics such as: quick wins and slow defeats ; scoring ; mobility or presence. The four is a new action selection distribution. The conducted experiments suggest that these techniques improve the level of play. Finally, we apply these different techniques to design program-players to the game of Hex (size 11 and 13) surpassing the level of Mohex 3HNN with reinforcement learning from self-play without knowledge.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes