RO | AI | CV | Aug 14, 2017

Deep Object-Centric Representations for Generalizable Robot Learning

arXiv:1708.04225v3 | 125 citations
Originality: Incremental advance
AI Analysis

This work addresses generalizable perception for robots in complex environments. It is an incremental advance that combines existing pretrained visual models with a new object-level attentional mechanism.

The paper tackles the problem of robotic manipulation in open-world scenarios by using pretrained visual models as an object-centric prior to improve perception and policy learning, achieving good generalization across object instances with very few samples.

Robotic manipulation in complex open-world scenarios requires both reliable physical manipulation skills and effective and generalizable perception. In this paper, we propose a method where general purpose pretrained visual models serve as an object-centric prior for the perception system of a learned policy. We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy. A task-independent meta-attention locates possible objects in the scene, and a task-specific attention identifies which objects are predictive of the trajectories. The scope of the task-specific attention is easily adjusted by showing demonstrations with distractor objects or with diverse relevant objects. Our results indicate that this approach exhibits good generalization across object instances using very few samples, and can be used to learn a variety of manipulation tasks using reinforcement learning.
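The task-specific attention described above can be illustrated with a minimal sketch. It assumes that a pretrained detector (the meta-attention stage) has already produced one feature vector and one bounding box per candidate object. The function name `task_attention`, the softmax over normalized dot products, the `temperature` parameter, and the toy data are all illustrative assumptions, not the paper's exact formulation.

```python
import math

def task_attention(features, boxes, w, temperature=1.0):
    """Soft attention over candidate objects (hypothetical sketch).

    features: list of per-object feature vectors from a pretrained model
    boxes:    list of [x_min, y_min, x_max, y_max], one per object
    w:        task-specific attention vector (learned from demonstrations)
    Returns the attention weights and the attention-weighted box center.
    """
    logits = []
    for f in features:
        # Normalize each feature so only its direction matters.
        norm = math.sqrt(sum(x * x for x in f)) or 1.0
        logits.append(sum(wi * x / norm for wi, x in zip(w, f)) / temperature)

    # Numerically stable softmax over the candidate objects.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Expected object position under the attention distribution;
    # a policy could consume this instead of raw pixels.
    cx = sum(p * (b[0] + b[2]) / 2 for p, b in zip(probs, boxes))
    cy = sum(p * (b[1] + b[3]) / 2 for p, b in zip(probs, boxes))
    return probs, (cx, cy)

# Toy scene with three candidate objects (made-up features and boxes).
features = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
boxes = [[0, 0, 10, 10], [20, 20, 40, 40], [50, 0, 60, 10]]
w = [0.0, 5.0]  # attention vector aligned with the second object's feature

probs, position = task_attention(features, boxes, w, temperature=0.1)
```

In this sketch, changing `w` changes which objects the policy attends to without retraining the visual features, which is one way to read the abstract's claim that relevant objects can be incorporated from only a few demonstrations.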

Code Implementations: 1 repo
