AI | LG | Jul 27, 2023

Thinker: Learning to Plan and Act

arXiv:2307.14993v2 | 12 citations | h-index: 24
Originality Highly original
AI Analysis

The paper addresses the challenge of automating planning in complex environments for reinforcement learning, and represents a novel advance rather than an incremental improvement.

The paper enables reinforcement learning agents to plan autonomously with a learned world model, achieving state-of-the-art performance in Sokoban and competitive results on the Atari 2600 benchmark.

We propose the Thinker algorithm, a novel approach that enables reinforcement learning agents to autonomously interact with and utilize a learned world model. The Thinker algorithm wraps the environment with a world model and introduces new actions designed for interacting with the world model. These model-interaction actions enable agents to perform planning by proposing alternative plans to the world model before selecting a final action to execute in the environment. This approach eliminates the need for handcrafted planning algorithms by enabling the agent to learn how to plan autonomously and allows for easy interpretation of the agent's plan with visualization. We demonstrate the algorithm's effectiveness through experimental results in the game of Sokoban and the Atari 2600 benchmark, where the Thinker algorithm achieves state-of-the-art performance and competitive results, respectively. Visualizations of agents trained with the Thinker algorithm demonstrate that they have learned to plan effectively with the world model to select better actions. Thinker is the first work showing that an RL agent can learn to plan with a learned world model in complex environments.
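The wrapping idea described above can be illustrated with a toy sketch. It is not the authors' implementation: the class names, the three action kinds ("imagine", "reset", "act"), the observation layout, and the 1-D chain task are all hypothetical, and a perfect copy of the dynamics stands in for the learned world model. The sketch only shows how model-interaction actions can sit alongside real actions behind a single step interface.

```python
from typing import Callable, Dict, Optional, Tuple


def chain_dynamics(state: int, action: int) -> Tuple[int, float]:
    """Toy 1-D chain with positions 0..3: action 1 moves right, anything else moves left.

    Reward 1.0 is given on first arriving at position 3.
    """
    next_state = max(0, min(3, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == 3 and state != 3 else 0.0
    return next_state, reward


class ChainEnv:
    """Hypothetical real environment that follows chain_dynamics."""

    def __init__(self) -> None:
        self.state = 0

    def step(self, action: int) -> Tuple[int, float]:
        self.state, reward = chain_dynamics(self.state, action)
        return self.state, reward


class ModelWrappedEnv:
    """Wraps an environment with a world model and adds model-interaction actions.

    Assumed action kinds (illustrative only):
      "imagine": roll the model forward one step from the imagined state
      "reset":   discard the current imagined plan and return to the real state
      "act":     execute an action in the real environment
    """

    def __init__(self, env: ChainEnv, model: Callable[[int, int], Tuple[int, float]]) -> None:
        self.env = env
        self.model = model  # stand-in for a learned world model
        self._sync()

    def _sync(self) -> None:
        # Imagination always restarts from the current real state.
        self.imagined_state = self.env.state
        self.imagined_return = 0.0

    def step(self, kind: str, action: Optional[int] = None) -> Tuple[Dict[str, float], float]:
        real_reward = 0.0
        if kind == "imagine":
            self.imagined_state, r = self.model(self.imagined_state, action)
            self.imagined_return += r
        elif kind == "reset":
            self._sync()
        elif kind == "act":
            _, real_reward = self.env.step(action)
            self._sync()
        else:
            raise ValueError(f"unknown action kind: {kind}")
        obs = {
            "real_state": self.env.state,
            "imagined_state": self.imagined_state,
            "imagined_return": self.imagined_return,
        }
        return obs, real_reward


if __name__ == "__main__":
    env = ModelWrappedEnv(ChainEnv(), chain_dynamics)
    env.step("imagine", 0)          # propose a plan that starts by moving left
    env.step("reset")               # abandon it and try an alternative
    for _ in range(3):
        obs, _ = env.step("imagine", 1)
    print(obs)                      # imagined rollout reaches the goal; real state is unchanged
    print(env.step("act", 1))       # commit to the first action of the better plan
```

In this sketch the agent's policy would choose among the three action kinds itself, so the planning procedure is whatever sequence of imagine and reset steps the policy learns to emit rather than a handcrafted search. The exposed imagined state and return are also what would make a plan straightforward to visualize.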

Code Implementations: 1 repo
