LG | Jun 19, 2022

A Survey on Model-based Reinforcement Learning

Fan-Ming Luo, Tian Xu, Hang Lai, Xiong-Hui Chen, Weinan Zhang, Yang Yu

arXiv:2206.09328v128.5171 citations | h-index: 45

Originality: Synthesis-oriented

AI Analysis

The survey provides a comprehensive overview for researchers and practitioners interested in improving sample efficiency and reducing errors in real-world RL applications, but as a survey paper its contribution is incremental rather than novel.

This survey reviews model-based reinforcement learning (MBRL) with a focus on recent progress in deep RL, analyzing the generalization error between learned environment models and real environments to guide algorithm design for better model learning, usage, and policy training.

Reinforcement learning (RL) solves sequential decision-making problems via a trial-and-error process of interacting with the environment. While RL has achieved outstanding success in playing complex video games that allow massive amounts of trial and error, making errors is always undesirable in the real world. To improve sample efficiency and thus reduce errors, model-based reinforcement learning (MBRL) is believed to be a promising direction: it builds environment models in which trial and error can take place without real costs. In this survey, we review MBRL with a focus on recent progress in deep RL. For non-tabular environments, there is always a generalization error between the learned environment model and the real environment. It is therefore of great importance to analyze the discrepancy between policy training in the environment model and policy training in the real environment, which in turn guides algorithm design for better model learning, model usage, and policy training. We also discuss recent advances in model-based techniques for other forms of RL, including offline RL, goal-conditioned RL, multi-agent RL, and meta-RL. Moreover, we discuss the applicability and advantages of MBRL in real-world tasks. Finally, we end this survey by discussing promising prospects for the future development of MBRL. We think that MBRL has great potential and advantages in real-world applications that have been overlooked, and we hope this survey will attract more research on MBRL.
