RO · LG · Feb 10, 2022

Factored World Models for Zero-Shot Generalization in Robotic Manipulation

arXiv:2202.05333v1 · 11 citations
Originality: Incremental advance
AI Analysis

The paper addresses the problem of scaling world models to multi-object robotic manipulation and demonstrates zero-shot generalization to novel tasks. The advance is incremental because it extends existing contrastive methods for training object-factored world models.

The paper tackles the combinatorial explosion of states in robotic manipulation by learning object-factored world models that are equivariant to object permutations, achieving successful zero-shot generalization to novel tasks with only a minor performance decrease and enabling planning for up to 12 pick-and-place actions.

World models for environments with many objects face a combinatorial explosion of states: as the number of objects increases, the number of possible arrangements grows exponentially. In this paper, we learn to generalize over robotic pick-and-place tasks using object-factored world models, which combat the combinatorial explosion by ensuring that predictions are equivariant to permutations of objects. Previous object-factored models were limited either by their inability to model actions, or by their inability to plan for complex manipulation tasks. We build on recent contrastive methods for training object-factored world models, which we extend to model continuous robot actions and to accurately predict the physics of robotic pick-and-place. To do so, we use a residual stack of graph neural networks that receive action information at multiple levels in both their node and edge neural networks. Crucially, our learned model can make predictions about tasks not represented in the training data. That is, we demonstrate successful zero-shot generalization to novel tasks, with only a minor decrease in model performance. Moreover, we show that an ensemble of our models can be used to plan for tasks involving up to 12 pick and place actions using heuristic search. We also demonstrate transfer to a physical robot.
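The equivariance property described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it is a minimal, standard-library-only illustration of a residual stack of message-passing layers in which the action vector is fed to both the edge and node functions. It assumes a fully connected object graph, a single global action vector, and single-matrix `tanh` "networks" with random weights; the names `ActionConditionedGNNLayer` and `predict` are hypothetical.

```python
import math
import random


def linear(weights, vector):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ActionConditionedGNNLayer:
    """One residual message-passing layer over a fully connected object graph.

    Assumptions for illustration only: the edge and node functions are single
    linear maps followed by tanh, and one global action vector is shared by
    all objects. Weights are shared across objects and incoming messages are
    summed, which makes the layer equivariant to object permutations.
    """

    def __init__(self, state_dim, action_dim, rng):
        # Edge function input: [z_i, z_j, action]
        self.w_edge = random_matrix(state_dim, 2 * state_dim + action_dim, rng)
        # Node function input: [z_i, summed messages, action]
        self.w_node = random_matrix(state_dim, 2 * state_dim + action_dim, rng)

    def __call__(self, states, action):
        dim = len(states[0])
        updated = []
        for i, z_i in enumerate(states):
            aggregated = [0.0] * dim
            for j, z_j in enumerate(states):
                if i == j:
                    continue
                # The action enters the edge function...
                message = [math.tanh(v) for v in linear(self.w_edge, z_i + z_j + action)]
                aggregated = [a + m for a, m in zip(aggregated, message)]
            # ...and the node function as well.
            delta = [math.tanh(v) for v in linear(self.w_node, z_i + aggregated + action)]
            # Residual connection: the layer predicts a change to each object state.
            updated.append([s + d for s, d in zip(z_i, delta)])
        return updated


def predict(layers, states, action):
    """Apply the stack of layers; every layer receives the action."""
    for layer in layers:
        states = layer(states, action)
    return states


if __name__ == "__main__":
    rng = random.Random(0)
    layers = [ActionConditionedGNNLayer(state_dim=3, action_dim=2, rng=rng) for _ in range(2)]
    states = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 4 objects
    action = [0.3, -0.7]

    out = predict(layers, states, action)
    perm = [2, 0, 3, 1]
    out_perm = predict(layers, [states[p] for p in perm], action)

    # Permuting the input objects permutes the predictions in the same way.
    equivariant = all(
        math.isclose(a, b, abs_tol=1e-9)
        for k, p in enumerate(perm)
        for a, b in zip(out_perm[k], out[p])
    )
    print("permutation equivariant:", equivariant)
```

Because the same weights are applied to every object and messages are aggregated by summation, reordering the input objects only reorders the outputs. The model therefore does not need to learn each arrangement of object slots separately, which is the sense in which factoring counters the combinatorial explosion.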

Code Implementations: 1 repo