RO · AI · Apr 12

AffordGen: Generating Diverse Demonstrations for Generalizable Object Manipulation with Afford Correspondence

arXiv:2604.10579 · 66.3 · h-index: 6
Predicted impact: top 28% in RO · last 90 days
Originality: Incremental advance
AI Analysis

For robot learning, this framework addresses data diversity constraints in imitation learning, enabling generalization to unseen objects with improved data efficiency.

AffordGen uses 3D generative models and vision foundation models to generate diverse manipulation trajectories via affordance correspondence. A visuomotor policy trained on this data achieves high success rates and zero-shot generalization to unseen objects, improving data efficiency in robot learning.

Despite the recent success of modern imitation learning methods in robot manipulation, their performance is often constrained by geometric variations due to limited data diversity. Leveraging powerful 3D generative models and vision foundation models (VFMs), the proposed AffordGen framework overcomes this limitation by utilizing the semantic correspondence of meaningful keypoints across large-scale 3D meshes to generate new robot manipulation trajectories. This large-scale, affordance-aware dataset is then used to train a robust, closed-loop visuomotor policy, combining the semantic generalizability of affordances with the reactive robustness of end-to-end learning. Experiments in simulation and the real world show that policies trained with AffordGen achieve high success rates and enable zero-shot generalization to truly unseen objects, significantly improving data efficiency in robot learning.
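The abstract does not say how a demonstration is carried over to a new mesh once corresponding keypoints are found. As a rough illustration only, and not the authors' method, the sketch below assumes the keypoint matches are already given, fits a rigid transform between the two keypoint sets with the standard Kabsch algorithm, and applies it to the end-effector waypoint positions of a source demonstration. The function names `fit_rigid_transform` and `transfer_waypoints` and all numeric values are hypothetical.

```python
import numpy as np


def fit_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that dst ~= src @ R.T + t.

    Standard Kabsch algorithm. src and dst are (N, 3) arrays of matched
    keypoints; the matching itself is assumed to be given.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def transfer_waypoints(waypoints, R, t):
    """Map (M, 3) end-effector positions from the source object frame to the target."""
    return np.asarray(waypoints, dtype=float) @ R.T + t


# Hypothetical usage: keypoints on a source object and their matches on a new object.
source_keypoints = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                             [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])
target_keypoints = source_keypoints[:, [1, 0, 2]] * [-1.0, 1.0, 1.0] + [0.3, 0.0, 0.05]
demo_waypoints = np.array([[0.05, 0.05, 0.20], [0.05, 0.05, 0.10]])

R, t = fit_rigid_transform(source_keypoints, target_keypoints)
new_waypoints = transfer_waypoints(demo_waypoints, R, t)
```

A rigid fit is the simplest choice and ignores differences in shape between objects. Handling the geometric variation the abstract emphasizes would likely need a non-rigid or per-keypoint warp, which this sketch does not attempt.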

