LG AI ML

Aug 25, 2019

Dynamics-aware Embeddings

William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta

arXiv:1908.09357v3

19.058 citations

Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the high sample complexity of reinforcement learning, offering researchers and practitioners an incremental improvement in sample efficiency.

The paper improves sample efficiency in reinforcement learning by proposing a forward prediction objective that learns embeddings of states and action sequences capturing the environment's dynamics. The result is efficient policy learning for goal-conditioned continuous control from pixels in 1-2 million environment steps.

In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL). We propose a forward prediction objective for simultaneously learning embeddings of states and action sequences. These embeddings capture the structure of the environment's dynamics, enabling efficient policy learning. We demonstrate that our action embeddings alone improve the sample efficiency and peak performance of model-free RL on control from low-dimensional states. By combining state and action embeddings, we achieve efficient learning of high-quality policies on goal-conditioned continuous control from pixel observations in only 1-2 million environment steps.
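The forward prediction objective described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the linear encoders, the plain squared-error loss, the dimensions, and the names `forward_prediction_loss`, `state_enc`, `action_enc`, and `predictor` are all hypothetical. The idea shown is only the general shape: embed the current state, embed a sequence of k actions as a single vector, and score how well the two embeddings together predict the state k steps later.

```python
import random


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def init_matrix(rows, cols, rng):
    """Small random weight matrix; the initialization scale is arbitrary."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def forward_prediction_loss(params, state, actions, future_state):
    """Mean squared error of predicting the state reached after the actions.

    Assumption: linear encoders and an unregularized squared-error loss,
    chosen for brevity. The source only says "forward prediction objective".
    """
    # Embed the current state.
    z_state = linear(params["state_enc"], state)
    # Flatten the k actions and embed the whole sequence as one vector.
    flat_actions = [a for action in actions for a in action]
    z_actions = linear(params["action_enc"], flat_actions)
    # Predict the state k steps ahead from the concatenated embeddings.
    predicted = linear(params["predictor"], z_state + z_actions)
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, future_state)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical sizes: 4-dim states, 2-dim actions, sequences of 3 actions.
STATE_DIM, ACTION_DIM, K, EMBED_DIM = 4, 2, 3, 3
rng = random.Random(0)

params = {
    "state_enc": init_matrix(EMBED_DIM, STATE_DIM, rng),
    "action_enc": init_matrix(EMBED_DIM, ACTION_DIM * K, rng),
    "predictor": init_matrix(STATE_DIM, 2 * EMBED_DIM, rng),
}

# A made-up transition: (state, k actions, state k steps later).
state = [rng.uniform(-1, 1) for _ in range(STATE_DIM)]
actions = [[rng.uniform(-1, 1) for _ in range(ACTION_DIM)] for _ in range(K)]
future_state = [rng.uniform(-1, 1) for _ in range(STATE_DIM)]

loss = forward_prediction_loss(params, state, actions, future_state)
print("forward prediction loss:", loss)
```

In a sketch like this, minimizing the loss over many transitions would push both encoders to keep whatever information helps predict future states, which is the sense in which the embeddings would capture the environment's dynamics. A policy could then act in the action-embedding space rather than on raw actions.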
