All Models are Wrong, Knowing Where is Useful: On Model Uncertainty in Reinforcement Learning
For researchers in MBRL and robotics, this work provides a method to reduce model exploitation, improving data efficiency and safety.
This paper addresses model exploitation in model-based reinforcement learning (MBRL), where the agent exploits inaccuracies in the learned dynamics model. It presents a framework for handling the uncertainty of probabilistic models that mitigates this issue, and reports recent successes in learning directly on hardware and in safe exploration.
Model-based reinforcement learning (MBRL) infers information about the environment from a learned dynamics model and has the potential to address open problems such as data-efficient and safe learning in robotics. However, the agent typically exploits inaccuracies of the learned dynamics model, which substantially hampers the capabilities of MBRL methods. We present a framework for dealing with inaccuracies of probabilistic models through targeted handling of uncertainty that effectively mitigates model exploitation. We also report recent successes in learning directly on hardware and in safe exploration, and discuss future directions for uncertainty-aware MBRL.
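The abstract does not spell out the framework itself, so the following is only a generic illustration of the underlying idea, not the authors' method. It assumes a small bootstrap ensemble of cubic least-squares models as a stand-in for a probabilistic dynamics model, uses ensemble disagreement as a proxy for model uncertainty, and subtracts that disagreement from the reward so that an agent would be discouraged from exploiting regions where the model is unreliable. All names (`fit_ensemble`, `predict`, `penalized_reward`, the `penalty` weight) and the toy dynamics are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_dynamics(s):
    # Toy 1-D environment, assumed purely for illustration.
    return np.sin(s)

def features(s):
    # Cubic polynomial features of the state.
    s = np.asarray(s, dtype=float)
    return np.stack([np.ones_like(s), s, s**2, s**3], axis=-1)

# Training data cover only the region [-1, 1].
states = rng.uniform(-1.0, 1.0, size=50)
next_states = true_dynamics(states) + 0.05 * rng.normal(size=50)

def fit_ensemble(x, y, n_members=5):
    # Fit each member on a bootstrap resample of the data.
    weights = []
    for _ in range(n_members):
        idx = rng.integers(0, len(x), size=len(x))
        w, *_ = np.linalg.lstsq(features(x[idx]), y[idx], rcond=None)
        weights.append(w)
    return np.stack(weights)  # shape: (n_members, 4)

def predict(weights, s):
    # Mean prediction and ensemble disagreement (std) at state s.
    preds = features(s) @ weights.T
    return preds.mean(axis=-1), preds.std(axis=-1)

def penalized_reward(reward, std, penalty=1.0):
    # Hypothetical uncertainty penalty: lower reward where members disagree.
    return reward - penalty * std

ensemble = fit_ensemble(states, next_states)
_, std_near = predict(ensemble, 0.0)  # inside the training region
_, std_far = predict(ensemble, 3.0)   # far outside the training region

print("disagreement near data:", std_near)
print("disagreement far from data:", std_far)
```

In this sketch the members agree where data are available and diverge when extrapolating, so the penalty mainly affects states the model has never seen. Other uncertainty estimates and other ways of using them are equally plausible; the choice here is only meant to make "knowing where the model is wrong" concrete.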