Self-Supervised Goal-Conditioned Pick and Place
This work addresses the challenge of enabling robots to learn manipulation tasks autonomously, without human intervention, though the contribution appears incremental because it builds on existing self-supervised and goal-conditioned approaches.
The paper tackles the problem of learning from autonomously collected robot data without human-labeled supervision. It learns pixel-wise object representations from unsupervised pick and place data that generalize to new objects, and it demonstrates the utility of the approach in a simulated grasping environment.
Robots can collect large amounts of data autonomously by interacting with objects in the world. However, it is often not obvious \emph{how} to learn from autonomously collected data without human-labeled supervision. In this work, we learn pixel-wise object representations from unsupervised pick and place data that generalize to new objects. We introduce a novel framework that uses these representations to predict where to pick and where to place in order to match a goal image. Finally, we demonstrate the utility of our approach in a simulated grasping environment.