AI, LG · Feb 12, 2021

Planning and Learning Using Adaptive Entropy Tree Search

Piotr Kozakowski, Mikołaj Pacek, Piotr Miłoś

arXiv:2102.06808v34.53 citations · Has Code

Originality: Incremental advance

AI Analysis

This work improves tree search planning when it is combined with learning. The advance is incremental because it builds on existing maximum entropy planning methods.

The paper combines tree-based planning with deep learning by introducing Adaptive Entropy Tree Search (ANTS). According to the authors, ANTS fixes the failure of earlier maximum entropy planning methods when they are combined with learning, and it significantly outperforms PUCT on the Atari benchmark.

Recent breakthroughs in Artificial Intelligence have shown that the combination of tree-based planning with deep learning can lead to superior performance. We present Adaptive Entropy Tree Search (ANTS) - a novel algorithm combining planning and learning in the maximum entropy paradigm. Through a comprehensive suite of experiments on the Atari benchmark we show that ANTS significantly outperforms PUCT, the planning component of the state-of-the-art AlphaZero system. ANTS builds upon recent work on maximum entropy planning methods - which however, as we show, fail in combination with learning. ANTS resolves this issue to reach state-of-the-art performance. We further find that ANTS exhibits superior robustness to different hyperparameter choices, compared to the previous algorithms. We believe that the high performance and robustness of ANTS can bring tree search planning one step closer to wide practical adoption.
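For readers unfamiliar with the two families the abstract contrasts, the following is a minimal sketch of the commonly cited PUCT selection score and of the log-sum-exp ("soft") value used in maximum entropy planning. These are the standard textbook forms, not code from the paper or its repository. The function names, the `c_puct` constant, and the fixed `temperature` argument are assumptions made for illustration; how ANTS adapts the temperature is not described in this listing and is not sketched here.

```python
import math


def puct_scores(q_values, priors, visit_counts, c_puct=1.0):
    """PUCT score per action: Q + c * P * sqrt(total visits) / (1 + N).

    Hypothetical helper; c_puct is an assumed exploration constant.
    """
    total = sum(visit_counts)
    return [
        q + c_puct * p * math.sqrt(total) / (1 + n)
        for q, p, n in zip(q_values, priors, visit_counts)
    ]


def soft_value(q_values, temperature):
    """Maximum entropy state value: tau * log(sum(exp(Q / tau))).

    The maximum is subtracted before exponentiating for numerical stability.
    """
    m = max(q_values)
    return m + temperature * math.log(
        sum(math.exp((q - m) / temperature) for q in q_values)
    )


def softmax_policy(q_values, temperature):
    """Action probabilities proportional to exp(Q / tau)."""
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(weights)
    return [w / z for w in weights]
```

In this sketch, a small temperature makes `soft_value` approach the maximum Q-value and the policy approach greedy selection, while a large temperature pushes the policy toward uniform.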
