RO LG · Feb 4, 2024

PoCo: Policy Composition from and for Heterogeneous Robot Learning

Lirui Wang, Jialiang Zhao, Yilun Du, Edward H. Adelson, Russ Tedrake

MIT

arXiv:2402.02511v326.264 citations · h-index: 87 · Robotics: Science and Systems

Originality: Incremental advance

AI Analysis

The paper addresses the cost and difficulty of training robot policies by enabling flexible composition across diverse data sources. It is an incremental improvement over existing methods.

The paper tackles the challenge of training general robotic policies from heterogeneous data that spans different modalities and domains. It introduces Policy Composition, which composes data distributions represented with diffusion models. The composed policy achieves robust and dexterous performance in tool-use tasks and outperforms baselines trained on a single data source in both simulation and real-world experiments.

Training general robotic policies from heterogeneous data for different tasks is a significant challenge. Existing robotic datasets vary in modality (color, depth, tactile, and proprioceptive information) and are collected in different domains such as simulation, real robots, and human videos. Current methods usually collect and pool all data from one domain to train a single policy to handle such heterogeneity in tasks and domains, which is prohibitively expensive and difficult. In this work, we present a flexible approach, dubbed Policy Composition, to combine information across such diverse modalities and domains for learning scene-level and task-level generalized manipulation skills, by composing different data distributions represented with diffusion models. Our method can use task-level composition for multi-task manipulation and be composed with analytic cost functions to adapt policy behaviors at inference time. We train our method on simulation, human, and real robot data and evaluate it on tool-use tasks. The composed policy achieves robust and dexterous performance under varying scenes and tasks and outperforms baselines from a single data source in both simulation and real-world experiments. See https://liruiw.github.io/policycomp for more details.
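The abstract does not spell out the composition rule, so the following is only a minimal sketch of one common way to compose diffusion models at inference time: take a weighted sum of the noise predictions from separately trained policies, and optionally add the gradient of an analytic cost. Everything here is an assumption for illustration, not the authors' implementation: the names `compose_noise`, `add_cost_guidance`, `toy_policy`, and `sample` are hypothetical, the "policies" are toy functions rather than trained networks, and the update loop is a deterministic stand-in for a real diffusion sampler.

```python
import numpy as np


def compose_noise(eps_list, weights):
    """Weighted sum of per-policy noise predictions (assumed composition rule)."""
    eps = np.stack([np.asarray(e, dtype=float) for e in eps_list])  # (K, ...)
    w = np.asarray(weights, dtype=float)                            # (K,)
    return np.tensordot(w, eps, axes=1)


def add_cost_guidance(eps, traj, cost_grad, scale):
    """Shift the noise prediction by the gradient of an analytic cost."""
    return eps + scale * cost_grad(traj)


def toy_policy(mean):
    """Hypothetical stand-in for a trained denoiser.

    Its "noise prediction" points from `mean` toward the current sample,
    so stepping against it moves the sample toward `mean`.
    """
    mean = np.asarray(mean, dtype=float)
    return lambda traj: traj - mean


def sample(policies, weights, traj, cost_grad=None, scale=0.0,
           step=0.5, n_steps=50):
    """Deterministic toy update loop (not a real DDPM/DDIM sampler)."""
    traj = np.asarray(traj, dtype=float)
    for _ in range(n_steps):
        eps = compose_noise([p(traj) for p in policies], weights)
        if cost_grad is not None:
            eps = add_cost_guidance(eps, traj, cost_grad, scale)
        traj = traj - step * eps
    return traj


if __name__ == "__main__":
    sim_policy = toy_policy([0.0, 0.0])    # e.g. trained on simulation data
    real_policy = toy_policy([2.0, 4.0])   # e.g. trained on real-robot data
    start = np.array([10.0, -10.0])

    # Equal-weight composition of the two toy policies.
    print(sample([sim_policy, real_policy], [0.5, 0.5], start))

    # Same composition, steered by a quadratic cost toward a goal point.
    goal = np.array([3.0, 2.0])
    print(sample([sim_policy, real_policy], [0.5, 0.5], start,
                 cost_grad=lambda t: t - goal, scale=1.0))
```

In this toy setting the weights control how strongly each data source pulls on the sample, and the cost term adjusts behavior without retraining either policy, which is the inference-time flexibility the abstract describes.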
