LG AI RO ML
Jun 26, 2020

A Unifying Framework for Reinforcement Learning and Planning

Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker

arXiv:2006.15009v4
12.816 citations

Originality: Synthesis-oriented

AI Analysis

This work provides a conceptual tool for researchers in AI to better understand and navigate the algorithmic design space of planning and reinforcement learning, though it is incremental in nature.

The paper tackles the challenge of unifying reinforcement learning and planning for sequential decision making by proposing a framework (FRAP) that identifies common dimensions across algorithms, and it demonstrates this by comparing various known algorithms within the framework.

Sequential decision making, commonly formalized as optimization of a Markov Decision Process, is a key challenge in artificial intelligence. Two successful approaches to MDP optimization are reinforcement learning and planning, which both largely have their own research communities. However, if both research fields solve the same problem, then we might be able to disentangle the common factors in their solution approaches. Therefore, this paper presents a unifying algorithmic framework for reinforcement learning and planning (FRAP), which identifies underlying dimensions on which MDP planning and learning algorithms have to decide. At the end of the paper, we compare a variety of well-known planning, model-free and model-based RL algorithms along these dimensions. Altogether, the framework may help provide deeper insight into the algorithmic design space of planning and reinforcement learning.
