AI · Sep 11, 2018

SAI, a Sensible Artificial Intelligence that plays Go

arXiv:1809.03928v2 · 15 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for Go AI enthusiasts and researchers, as it enhances reinforcement learning efficiency in a specific domain.

The authors tackled the problem of improving Go-playing AI by modifying the AlphaGo Zero paradigm to handle multiple komi values, resulting in a very strong playing agent on 7x7 Go.

We propose a multiple-komi modification of the AlphaGo Zero/Leela Zero paradigm. The winrate as a function of the komi is modeled with a two-parameter sigmoid function, so that the neural network must predict just one more variable to assess the winrate for all komi values. A second novel feature is that training is based on self-play games that occasionally branch, with changed komi, when the position is uneven. With this setting, reinforcement learning is shown to work on 7x7 Go, obtaining very strong playing agents. As a useful byproduct, the sigmoid parameters given by the network make it possible to estimate the score difference on the board and to evaluate how decided the game is.
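The sigmoid model in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact parametrization, so the form below is an assumption: `alpha` is read as the estimated score lead of Black (the komi at which the game is even) and `beta` as the steepness, with komi counted as points given to White. The function names and the example values are hypothetical and are not taken from the paper or its code.

```python
import math


def winrate(komi: float, alpha: float, beta: float) -> float:
    """Black's estimated winrate at a given komi.

    Assumed form (not confirmed by the source):
        winrate(komi) = 1 / (1 + exp(-beta * (alpha - komi)))
    alpha: komi at which the position is even (estimated score lead).
    beta:  steepness; larger values mean the outcome is more decided.
    """
    return 1.0 / (1.0 + math.exp(-beta * (alpha - komi)))


def estimated_score_lead(alpha: float) -> float:
    """Under the assumed form, the winrate crosses 0.5 exactly at komi == alpha."""
    return alpha


if __name__ == "__main__":
    # Hypothetical network outputs for one position.
    alpha, beta = 9.0, 0.8
    for komi in (5.5, 9.0, 12.5):
        print(f"komi={komi:5.1f}  winrate={winrate(komi, alpha, beta):.3f}")
```

Under this assumed form, one pair `(alpha, beta)` yields a winrate for every komi, which is why a single extra network output is enough. A large `beta` makes the curve close to a step function, corresponding to a game that is nearly decided.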

Code Implementations: 1 repo
