RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications
This work addresses the sim-to-real gap for depth data in robotic vision applications. It is an incremental advance that focuses on a single data modality.
The paper tackles the sim-to-real domain gap in robotic vision with RaSim, a range-aware RGB-D data simulation pipeline that generates high-fidelity depth data. Models trained on RaSim data can be applied directly to real-world scenarios without finetuning and perform well on downstream tasks.
In robotic vision, a de facto paradigm is to learn in simulated environments and then transfer to real-world applications, which makes bridging the sim-to-real domain gap an essential challenge. While mainstream works tackle this problem in the RGB domain, we focus on depth data synthesis and develop a range-aware RGB-D data simulation pipeline (RaSim). In particular, high-fidelity depth data is generated by imitating the imaging principle of real-world sensors. A range-aware rendering strategy is further introduced to enrich data diversity. Extensive experiments show that models trained with RaSim can be directly applied to real-world scenarios without any finetuning and excel at downstream RGB-D perception tasks.
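To make "imitating the imaging principle of real-world sensors" concrete, the following is a minimal sketch of one way a depth sensor could be imitated, assuming a stereo-triangulation sensor. The abstract does not say which sensor model RaSim uses, so the function name `simulate_stereo_depth` and every parameter value (focal length, baseline, sub-pixel step, working range) are hypothetical and chosen only for illustration. The sketch converts ideal depth to disparity, quantizes the disparity as a matching stage might, and converts back, so that depth error varies with distance and out-of-range pixels become invalid.

```python
def simulate_stereo_depth(depth_m, focal_px=600.0, baseline_m=0.05,
                          subpixel_step=0.125, min_depth_m=0.2, max_depth_m=5.0):
    """Return a sensor-like depth value (metres) for one ideal depth value.

    All defaults are hypothetical; they are not taken from the RaSim paper.
    A return value of 0.0 marks an invalid (missing) measurement.
    """
    # Outside the assumed working range the sensor reports no depth.
    if not (min_depth_m <= depth_m <= max_depth_m):
        return 0.0

    # Stereo triangulation: disparity (pixels) = focal * baseline / depth.
    disparity = focal_px * baseline_m / depth_m

    # Imitate finite matching precision by snapping to a sub-pixel grid.
    disparity_q = round(disparity / subpixel_step) * subpixel_step
    if disparity_q <= 0:
        return 0.0

    # Convert the quantized disparity back to depth.
    return focal_px * baseline_m / disparity_q


# Apply the model to a few ideal depth values from a renderer.
ideal = [0.1, 1.0, 2.3, 4.9, 10.0]
simulated = [simulate_stereo_depth(z) for z in ideal]
```

Under these assumptions, values outside the working range come back as 0.0, mimicking holes in real depth maps, while in-range values pick up a small quantization error.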