RO · AI · Apr 17

From Seeing to Simulating: Generative High-Fidelity Simulation with Digital Cousins for Generalizable Robot Learning and Evaluation

arXiv:2604.15805 · 21.6 · h-index: 4
Predicted impact: top 27% in RO · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the costly data collection bottleneck in robot learning by providing a scalable simulation augmentation pipeline for generalizable policy learning and evaluation.

The paper introduces a generative framework that converts real-world panoramas into high-fidelity simulation scenes, enabling diverse data augmentation for robot learning. Experiments show that scaling up data generation with this method significantly improves generalization to unseen scenes and objects.

Learning robust robot policies in real-world environments requires diverse data augmentation, yet scaling real-world data collection is costly due to the need for acquiring physical assets and reconfiguring environments. Therefore, augmenting real-world scenes into simulation has become a practical approach to efficient learning and evaluation. We present a generative framework that establishes a real-to-sim mapping from real-world panoramas to high-fidelity simulation scenes and further synthesizes diverse cousin scenes via semantic and geometric editing. Combined with high-quality physics engines and realistic assets, the generated scenes support interactive manipulation tasks. Additionally, we incorporate multi-room stitching to construct consistent large-scale environments for long-horizon navigation across complex layouts. Experiments demonstrate a strong sim-to-real correlation, validating our platform's fidelity, and show that extensively scaling up data generation leads to significantly better generalization to unseen scene and object variations, underscoring the effectiveness of Digital Cousins for generalizable robot learning and evaluation.

