AI | LG | Feb 12, 2021

Planning and Learning Using Adaptive Entropy Tree Search

arXiv:2102.06808v3 | 3 citations
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in tree-search planning: existing maximum entropy planning methods fail when combined with learning. The advance is incremental, since it builds on those existing methods.

The paper tackles the problem of combining tree-based planning with deep learning by introducing Adaptive Entropy Tree Search (ANTS), which fixes the failure of earlier maximum entropy planning methods when they are combined with learning and significantly outperforms PUCT on the Atari benchmark.

Recent breakthroughs in Artificial Intelligence have shown that the combination of tree-based planning with deep learning can lead to superior performance. We present Adaptive Entropy Tree Search (ANTS) - a novel algorithm combining planning and learning in the maximum entropy paradigm. Through a comprehensive suite of experiments on the Atari benchmark we show that ANTS significantly outperforms PUCT, the planning component of the state-of-the-art AlphaZero system. ANTS builds upon recent work on maximum entropy planning methods - which however, as we show, fail in combination with learning. ANTS resolves this issue to reach state-of-the-art performance. We further find that ANTS exhibits superior robustness to different hyperparameter choices, compared to the previous algorithms. We believe that the high performance and robustness of ANTS can bring tree search planning one step closer to wide practical adoption.
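
As background for the two planning styles the abstract contrasts, the sketch below shows the standard PUCT action score used in AlphaZero-style search next to the generic maximum entropy (soft) value and policy. It is a minimal illustration, not the ANTS algorithm: the function names, the `c_puct` default, and the fixed `temperature` argument are assumptions made for this example, and the adaptive part of ANTS is not reproduced here.

```python
import math


def puct_scores(q, prior, visits, c_puct=1.0):
    """PUCT score per action: Q + c_puct * P * sqrt(sum of visits) / (1 + N)."""
    total = sum(visits)
    return [
        q_a + c_puct * p_a * math.sqrt(total) / (1 + n_a)
        for q_a, p_a, n_a in zip(q, prior, visits)
    ]


def soft_value(q, temperature):
    """Soft state value: temperature * log(sum(exp(Q / temperature)))."""
    m = max(q)  # subtract the max for numerical stability
    return m + temperature * math.log(
        sum(math.exp((q_a - m) / temperature) for q_a in q)
    )


def soft_policy(q, temperature):
    """Softmax policy over Q-values at the given temperature."""
    m = max(q)
    weights = [math.exp((q_a - m) / temperature) for q_a in q]
    z = sum(weights)
    return [w / z for w in weights]


# Hypothetical numbers for one tree node with two actions.
q_values = [0.5, 0.2]
print(puct_scores(q_values, prior=[0.6, 0.4], visits=[3, 1]))
print(soft_value(q_values, temperature=0.1))
print(soft_policy(q_values, temperature=0.1))
```

In this sketch a lower temperature moves the soft value toward the plain maximum over Q-values and concentrates the policy on the best action, while a higher temperature spreads probability more evenly across actions.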

Code Implementations: 1 repo
