RO AI LG
Apr 18, 2022

INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL

Homanga Bharadhwaj, Mohammad Babaeizadeh, Dumitru Erhan, Sergey Levine

arXiv:2204.08585v122.436 citations
h-index: 166

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of handling complex visual observations in RL for robot control, though it is an incremental improvement over existing model-based approaches.

It tackles the problem of learning representations in visual model-based RL that prioritize functionally relevant aspects of the state over distractors. The authors report higher sample efficiency and episodic returns than state-of-the-art methods.

Model-based reinforcement learning (RL) algorithms designed for handling complex visual observations typically learn some sort of latent state representation, either explicitly or implicitly. Standard methods of this sort do not distinguish between functionally relevant aspects of the state and irrelevant distractors, instead aiming to represent all available information equally. We propose a modified objective for model-based RL that, in combination with mutual information maximization, allows us to learn representations and dynamics for visual model-based RL without reconstruction in a way that explicitly prioritizes functionally relevant factors. The key principle behind our design is to integrate a term inspired by variational empowerment into a state-space model based on mutual information. This term prioritizes information that is correlated with action, thus ensuring that functionally relevant factors are captured first. Furthermore, the same empowerment term also promotes faster exploration during the RL process, especially for sparse-reward tasks where the reward signal is insufficient to drive exploration in the early stages of learning. We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds, and show that the proposed prioritized information objective outperforms state-of-the-art model-based RL approaches with higher sample efficiency and episodic returns. Project website: https://sites.google.com/view/information-empowerment
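
The idea that an empowerment-style term "prioritizes information that is correlated with action" can be illustrated with a small toy calculation. The sketch below is not the paper's objective or implementation; it is a minimal, assumed example with hypothetical probability tables. It computes the mutual information between a binary action and two binary next-state factors: one that mostly follows the action (a "controllable" factor) and one that is independent of it (a "distractor"). An objective that rewards action-correlated information would favor the first factor.

```python
import numpy as np


def mutual_information(joint):
    """Return I(X; Y) in nats for a discrete joint probability table joint[x, y]."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()                # normalize defensively
    px = joint.sum(axis=1, keepdims=True)      # marginal over rows, shape (n, 1)
    py = joint.sum(axis=0, keepdims=True)      # marginal over columns, shape (1, m)
    product = px @ py                          # independent baseline p(x) p(y)
    mask = joint > 0                           # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


# Hypothetical toy tables (assumptions for illustration only).
# Rows index the action a in {0, 1}; columns index the next-state factor.
# Controllable factor: matches the action 90% of the time.
joint_controllable = np.array([[0.45, 0.05],
                               [0.05, 0.45]])
# Distractor factor: independent of the action (e.g. background video).
joint_distractor = np.array([[0.25, 0.25],
                             [0.25, 0.25]])

mi_controllable = mutual_information(joint_controllable)
mi_distractor = mutual_information(joint_distractor)

print(f"I(action; controllable factor) = {mi_controllable:.4f} nats")
print(f"I(action; distractor factor)   = {mi_distractor:.4f} nats")
```

In this toy setting the distractor carries no information about the action, so a term that scores representations by their mutual information with actions would give it no credit, while the controllable factor is rewarded. The paper works with learned latent states and a variational bound rather than exact discrete tables.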
