RO · CV · Mar 28, 2024

RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents

arXiv:2403.19622v2 · 16 citations · h-index: 32
Originality: Synthesis-oriented
AI Analysis

This work addresses the scarcity of primitive-level data facing researchers who develop composable generalization agents in robotics, though the contribution is incremental because it builds on existing VLM-based planning approaches.

The authors tackle the lack of primitive-level data for robotic manipulation by introducing RH20T-P, a dataset of about 38k annotated video clips across 67 tasks, and demonstrate its utility through a baseline agent, RA-P, that shows positive performance on unseen tasks.

Achieving generalizability in solving out-of-distribution tasks is one of the ultimate goals of learning robotic manipulation. Recent progress in Vision-Language Models (VLMs) has shown that VLM-based task planners can alleviate the difficulty of solving novel tasks by decomposing compound tasks into a plan that sequentially executes primitive-level skills that have already been mastered. It is also promising for robotic manipulation to adopt such composable generalization ability, in the form of composable generalization agents (CGAs). However, the community lacks both a reliable design of primitive skills and a sufficient amount of primitive-level data annotations. Therefore, we propose RH20T-P, a primitive-level robotic manipulation dataset, which contains about 38k video clips covering 67 diverse manipulation tasks in real-world scenarios. Each clip is manually annotated according to a set of meticulously designed primitive skills that are common in robotic manipulation. Furthermore, we standardize a plan-execute CGA paradigm and implement an exemplar baseline called RA-P on RH20T-P, whose positive performance on unseen tasks validates that the proposed dataset can offer composable generalization ability to robotic manipulation agents.
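
The plan-execute idea in the abstract can be illustrated with a minimal sketch. It is not the authors' RA-P implementation: the skill names (`move_to`, `grasp`, `release`), the `Step` type, the example task, and the lookup-table planner standing in for a VLM are all hypothetical, chosen only to show how a compound task might be decomposed into a sequence of already-mastered primitive skills and then executed in order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Step:
    skill: str   # name of a primitive skill (hypothetical, not from RH20T-P)
    target: str  # object or location the skill acts on


# Hypothetical library of mastered primitives; each returns a log line
# instead of driving a real robot.
SKILLS: Dict[str, Callable[[str], str]] = {
    "move_to": lambda target: f"moved to {target}",
    "grasp": lambda target: f"grasped {target}",
    "release": lambda target: f"released {target}",
}


def plan(task: str) -> List[Step]:
    """Stand-in for a VLM planner: a fixed lookup table (assumption)."""
    plans = {
        "put block in bowl": [
            Step("move_to", "block"),
            Step("grasp", "block"),
            Step("move_to", "bowl"),
            Step("release", "block"),
        ],
    }
    if task not in plans:
        raise KeyError(f"no plan available for task: {task!r}")
    return plans[task]


def execute(steps: List[Step]) -> List[str]:
    """Run each planned step in order using the primitive skill library."""
    log = []
    for step in steps:
        if step.skill not in SKILLS:
            raise ValueError(f"unknown primitive skill: {step.skill!r}")
        log.append(SKILLS[step.skill](step.target))
    return log


if __name__ == "__main__":
    for line in execute(plan("put block in bowl")):
        print(line)
```

In this sketch, generalization to a new task would come from the planner emitting a new sequence over the same fixed skill set, which is the property a primitive-level dataset is meant to support.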
