BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning
This work addresses a long-standing challenge in robot learning, zero-shot task generalization for robotic systems, though its contribution is incremental: it mainly scales existing imitation learning methods.
The paper studies how a vision-based robotic manipulation system can generalize to novel tasks by scaling and broadening imitation learning data. The resulting system achieves a 44% average success rate on 24 unseen tasks, with no robot demonstrations for those tasks.
In this paper, we study the problem of enabling a vision-based robotic manipulation system to generalize to novel tasks, a long-standing challenge in robot learning. We approach the challenge from an imitation learning perspective, aiming to study how scaling and broadening the data collected can facilitate such generalization. To that end, we develop an interactive and flexible imitation learning system that can learn from both demonstrations and interventions and can be conditioned on different forms of information that convey the task, including pre-trained embeddings of natural language or videos of humans performing the task. When scaling data collection on a real robot to more than 100 distinct tasks, we find that this system can perform 24 unseen manipulation tasks with an average success rate of 44%, without any robot demonstrations for those tasks.
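To make the idea of a task-conditioned imitation policy concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the linear policy, the two-dimensional "observation", the one-hot vectors standing in for pre-trained task embeddings, and the names `policy`, `toy_expert`, `bc_loss`, and `bc_step` are all assumptions made for illustration. It only shows the general pattern of feeding a task embedding to the policy alongside the observation and fitting the policy to demonstrated actions by behavior cloning.

```python
import random

# Hypothetical stand-ins for pre-trained task embeddings (e.g. of language commands).
TASK_A = [1.0, 0.0]
TASK_B = [0.0, 1.0]

def policy(weights, obs, task):
    """Linear policy: the action depends on the observation and the task embedding."""
    inputs = obs + task  # concatenate observation features and task embedding
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def toy_expert(obs, task):
    """Toy demonstrator (an assumption): the task shifts the first action dimension."""
    return [obs[0] + task[0] - task[1], obs[1]]

def bc_loss(weights, data):
    """Mean squared error between predicted and demonstrated actions."""
    total = 0.0
    for obs, task, expert_action in data:
        pred = policy(weights, obs, task)
        total += sum((p - a) ** 2 for p, a in zip(pred, expert_action))
    return total / len(data)

def bc_step(weights, data, lr=0.1):
    """One full-batch gradient descent step on the behavior cloning loss."""
    grads = [[0.0] * len(row) for row in weights]
    for obs, task, expert_action in data:
        inputs = obs + task
        pred = policy(weights, obs, task)
        for i, (p, a) in enumerate(zip(pred, expert_action)):
            for j, x in enumerate(inputs):
                grads[i][j] += 2.0 * (p - a) * x / len(data)
    return [[w - lr * g for w, g in zip(row, grow)]
            for row, grow in zip(weights, grads)]

# Build a small synthetic dataset of (observation, task embedding, expert action).
rng = random.Random(0)
data = []
for k in range(40):
    obs = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    task = TASK_A if k % 2 == 0 else TASK_B
    data.append((obs, task, toy_expert(obs, task)))

# Train: 2 action dimensions, 4 inputs (2 observation + 2 task embedding).
weights = [[0.0] * 4 for _ in range(2)]
initial_loss = bc_loss(weights, data)
for _ in range(500):
    weights = bc_step(weights, data)
final_loss = bc_loss(weights, data)

# The same observation yields different actions under different task embeddings.
obs = [0.5, 0.5]
action_a = policy(weights, obs, TASK_A)
action_b = policy(weights, obs, TASK_B)
```

In this sketch the task embedding is simply concatenated with the observation, which is one common way to condition a policy. The abstract does not say how the system combines the two, so the concatenation here should be read as an assumption.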