RO · CV · LG · Mar 7, 2024

Closing the Visual Sim-to-Real Gap with Object-Composable NeRFs

arXiv:2403.04114v1 · 5 citations · h-index: 7 · ICRA
Originality: Incremental advance
AI Analysis

This paper addresses sim-to-real transfer for robotic perception, aiming to reduce the manual tuning and brittleness associated with domain randomization. It appears incremental because it builds on existing NeRF methods.

The paper tackles the cost of real-world training data for robotic perception by introducing COV-NeRF, an object-composable NeRF model that extracts objects from real images and composes them into new scenes. The resulting photorealistic renderings and supervision are reported to rapidly close the sim-to-real gap across perceptual modalities.

Deep learning methods for perception are the cornerstone of many robotic systems. Despite their potential for impressive performance, obtaining real-world training data is expensive, and can be impractically difficult for some tasks. Sim-to-real transfer with domain randomization offers a potential workaround, but often requires extensive manual tuning and results in models that are brittle to distribution shift between sim and real. In this work, we introduce Composable Object Volume NeRF (COV-NeRF), an object-composable NeRF model that is the centerpiece of a real-to-sim pipeline for synthesizing training data targeted to scenes and objects from the real world. COV-NeRF extracts objects from real images and composes them into new scenes, generating photorealistic renderings and many types of 2D and 3D supervision, including depth maps, segmentation masks, and meshes. We show that COV-NeRF matches the rendering quality of modern NeRF methods, and can be used to rapidly close the sim-to-real gap across a variety of perceptual modalities.
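To make the idea of composing object volumes concrete, here is a minimal sketch of how several per-object radiance fields can be blended along a single camera ray to yield color, depth, and a soft segmentation mask at once. It uses the generic density-weighted compositing rule common to compositional NeRF work and is not taken from the paper; the function name `composite_ray` and its argument layout are assumptions made for illustration.

```python
import math

def composite_ray(sigmas, colors, ts, deltas):
    """Composite K object fields along one ray (illustrative, not COV-NeRF's code).

    sigmas[k][i]: density of object k at sample i
    colors[k][i]: (r, g, b) emitted by object k at sample i
    ts[i]:        depth of sample i along the ray
    deltas[i]:    spacing between sample i and the next sample
    """
    num_objects, num_samples = len(sigmas), len(ts)
    rgb = [0.0, 0.0, 0.0]
    depth = 0.0
    masks = [0.0] * num_objects   # per-object share of the ray's opacity
    transmittance = 1.0           # fraction of light not yet absorbed

    for i in range(num_samples):
        # Densities of overlapping objects add up.
        sigma_total = sum(sigmas[k][i] for k in range(num_objects))
        if sigma_total <= 0.0:
            continue
        alpha = 1.0 - math.exp(-sigma_total * deltas[i])
        weight = transmittance * alpha
        for k in range(num_objects):
            # Each object receives the weight in proportion to its density.
            share = weight * sigmas[k][i] / sigma_total
            masks[k] += share
            for c in range(3):
                rgb[c] += share * colors[k][i][c]
        depth += weight * ts[i]
        transmittance *= 1.0 - alpha

    return rgb, depth, masks, transmittance
```

Because the per-object shares are accumulated separately, the same pass that renders the pixel color also produces an expected depth and one mask value per object, which is how a composed scene can supply several kinds of supervision from a single rendering step. Taking the index of the largest entry in `masks` gives a hard instance label for the pixel.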

Code Implementations: 1 repo