CV · Sep 3, 2025

UrbanTwin: Building High-Fidelity Digital Twins for Sim2Real LiDAR Perception and Evaluation

arXiv:2509.02903v2 · h-index: 1 · Has Code
Originality: Incremental advance
AI Analysis

The workflow enables scalable, cost-effective data generation for Sim2Real learning in ITS, though the contribution is incremental because it builds on existing simulation methods.

The paper addresses the cost and labor of creating LiDAR datasets for intelligent transportation systems. It introduces a reproducible workflow for building high-fidelity digital twins that generate realistic synthetic datasets, and releases three synthetic LiDAR datasets that the authors report outperform real-data-trained baselines in perception tasks.

LiDAR-based perception in intelligent transportation systems (ITS) relies on deep neural networks trained with large-scale labeled datasets. However, creating such datasets is expensive, time-consuming, and labor-intensive, limiting the scalability of perception systems. Sim2Real learning offers a scalable alternative, but its success depends on the simulation's fidelity to real-world environments, dynamics, and sensors. This tutorial introduces a reproducible workflow for building high-fidelity digital twins (HiFi DTs) to generate realistic synthetic datasets. We outline practical steps for modeling static geometry, road infrastructure, and dynamic traffic using open-source resources such as satellite imagery, OpenStreetMap, and sensor specifications. The resulting environments support scalable and cost-effective data generation for robust Sim2Real learning. Using this workflow, we have released three synthetic LiDAR datasets, namely UT-LUMPI, UT-V2X-Real, and UT-TUMTraf-I, which closely replicate real locations and outperform real-data-trained baselines in perception tasks. This guide enables broader adoption of HiFi DTs in ITS research and deployment.

