LG · ML · Apr 15, 2020

Bootstrapped model learning and error correction for planning with uncertainty in model-based RL

arXiv:2004.07155v1 · 4 citations
AI Analysis

This paper addresses poor planning performance caused by imperfect learned models in RL, but the contribution is incremental, since it builds on existing uncertainty-aware methods.

The paper tackles the problem of model misspecification in model-based reinforcement learning by proposing a bootstrapped multi-headed neural network to learn distributions of future states and rewards, along with a global error correction filter, and demonstrates increased performance and stability on Minipacman.

Having access to a forward model enables the use of planning algorithms such as Monte Carlo Tree Search and Rolling Horizon Evolution. Where a model is unavailable, a natural aim is to learn a model that accurately reflects the dynamics of the environment. In many situations this might not be possible, and even minor glitches in the model may lead to poor performance and failure. This paper explores the problem of model misspecification through uncertainty-aware reinforcement learning agents. We propose a bootstrapped multi-headed neural network that learns the distribution of future states and rewards. We experiment with a number of schemes to extract the most likely predictions. We also introduce a global error correction filter that applies high-level constraints guided by the context provided through the predictive distribution. We illustrate our approach on Minipacman. The evaluation demonstrates that when dealing with imperfect models, our methods exhibit increased performance and stability, both in terms of model accuracy and when the model is used within a planning algorithm.
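
The abstract does not spell out how the heads are trained or how their outputs are combined, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each head sees a random bootstrap subset of the data (the Bernoulli mask and `keep_prob` are illustrative choices) and that per-cell majority voting is one possible scheme for extracting the most likely prediction. The function names, the cell labels, and the three example heads are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def bootstrap_masks(num_samples, num_heads, keep_prob=0.5, seed=0):
    """Draw one Bernoulli mask per head (assumed scheme, not the paper's).

    masks[k][i] is True when head k would train on sample i.
    """
    rng = random.Random(seed)
    return [
        [rng.random() < keep_prob for _ in range(num_samples)]
        for _ in range(num_heads)
    ]


def majority_vote(head_predictions):
    """Combine per-head predictions cell by cell.

    head_predictions: list of K equal-length lists of class labels.
    Returns (voted_labels, agreement), where agreement[i] is the fraction
    of heads that chose the winning label for cell i.
    """
    num_heads = len(head_predictions)
    voted, agreement = [], []
    for cell_labels in zip(*head_predictions):
        label, count = Counter(cell_labels).most_common(1)[0]
        voted.append(label)
        agreement.append(count / num_heads)
    return voted, agreement


# Three hypothetical heads, each predicting a 4-cell next state.
heads = [
    ["wall", "food", "ghost", "empty"],
    ["wall", "food", "empty", "empty"],
    ["wall", "pacman", "ghost", "empty"],
]
state, agreement = majority_vote(heads)
print(state)
print(agreement)
```

Cells where the agreement falls below 1.0 are the ones where the heads disagree. A downstream step, such as the error correction filter the abstract mentions, could in principle use that signal as context, although how the paper does so is not described here.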

