LG | AI | RO | Dec 2, 2021

Robust Robotic Control from Pixels using Contrastive Recurrent State-Space Models

arXiv:2112.01163v1 | 5 citations
Originality: Highly original
AI Analysis

This work addresses robust pixel-based robotic control, representing a strong domain-specific advancement.

The paper tackles the challenge of learning world models from high-dimensional pixel observations in unconstrained robotic environments. It develops a contrastive recurrent state-space model, a recurrent latent dynamics model that contrastively predicts the next observation, and reports state-of-the-art results on the Distracting Control Suite benchmark.

Modeling the world can benefit robot learning by providing a rich training signal for shaping an agent's latent state space. However, learning world models in unconstrained environments over high-dimensional observation spaces such as images is challenging. One source of difficulty is the presence of irrelevant but hard-to-model background distractions, and unimportant visual details of task-relevant entities. We address this issue by learning a recurrent latent dynamics model which contrastively predicts the next observation. This simple model leads to surprisingly robust robotic control even with simultaneous camera, background, and color distractions. We outperform alternatives such as bisimulation methods which impose state-similarity measures derived from divergence in future reward or future optimal actions. We obtain state-of-the-art results on the Distracting Control Suite, a challenging benchmark for pixel-based robotic control.
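As a rough illustration of what "contrastively predicts the next observation" can mean, the sketch below computes an InfoNCE-style loss in plain Python. It is an assumption made for illustration, not the authors' implementation: the abstract does not state the exact objective, and the names `info_nce_loss`, `predicted`, `targets`, and `temperature` are hypothetical. Each predicted embedding is scored against every next-observation embedding in the batch; the matching one is the positive and the rest act as negatives.

```python
import math


def info_nce_loss(predicted, targets, temperature=0.1):
    """Mean InfoNCE-style loss over a batch (illustrative sketch only).

    predicted[i]: the dynamics model's predicted embedding for item i
                  (hypothetical output of a recurrent latent model).
    targets[i]:   the encoder embedding of the actual next observation.
    targets[j] with j != i are used as negatives for item i.
    """
    def normalize(v):
        # Scale to unit length so the dot product is a cosine similarity.
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    preds = [normalize(p) for p in predicted]
    tgts = [normalize(t) for t in targets]

    total = 0.0
    for i, p in enumerate(preds):
        # Similarity of prediction i to every candidate next observation.
        logits = [sum(a * b for a, b in zip(p, t)) / temperature for t in tgts]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index i as the correct class.
        total += log_norm - logits[i]
    return total / len(preds)


# Predictions that match their own targets give a low loss;
# predictions paired with the wrong targets give a high loss.
aligned = [[1.0, 0.0], [0.0, 1.0]]
print(info_nce_loss(aligned, aligned))
print(info_nce_loss(aligned, aligned[::-1]))
```

Because the loss only asks the model to tell the true next observation apart from other observations, it does not need to reconstruct every pixel. That is one plausible reading of why background, camera, and color distractions matter less under a contrastive objective than under pixel reconstruction.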

Code Implementations: 1 repo
