RO | Sep 10, 2020

Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning

arXiv:2009.05085v1 | 112 citations
AI Analysis

This work addresses the problem of improving vision-based reinforcement learning for robotic manipulation, offering incremental advancements in generalization and transfer to real-world scenarios.

The paper tackles the challenge of learning predictive models for robotic manipulation from high-dimensional images by introducing self-supervised visual correspondence learning, demonstrating performance improvements over autoencoder-based methods in simulation and real-world hardware experiments.

Predictive models have been at the core of many robotic systems, from quadrotors to walking robots. However, it has been challenging to develop and apply such models to practical robotic manipulation because sensory observations such as images are high-dimensional. Previous approaches to learning models for robotic manipulation have either learned whole-image dynamics or used autoencoders to learn dynamics in a low-dimensional latent state. In this work, we introduce model-based prediction with self-supervised visual correspondence learning. We show not only that this is possible, but also that these predictive models deliver compelling performance improvements over alternative methods for vision-based RL with autoencoder-type vision training. Through simulation experiments, we demonstrate that our models provide better generalization precision, particularly in 3D scenes, scenes involving occlusion, and category generalization. Additionally, we validate through hardware experiments that our method transfers effectively to the real world. Videos and supplementary materials are available at https://sites.google.com/view/keypointsintothefuture

