Wanderland: Geometrically Grounded Simulation for Open-World Embodied AI
This work addresses the bottleneck of reliable benchmarking for open-world embodied AI tasks such as visual navigation by offering a geometrically grounded simulation framework, though its improvement over existing simulation methods is incremental.
The paper tackles the challenge of reproducible closed-loop evaluation in embodied AI by introducing Wanderland, a real-to-sim framework that provides high-fidelity simulation with accurate geometry and robust view synthesis. The resulting dataset is used to demonstrate how poor geometry adversely affects navigation policy learning and evaluation reliability.
Reproducible closed-loop evaluation remains a major bottleneck in embodied AI tasks such as visual navigation. A promising path forward is high-fidelity simulation that combines photorealistic sensor rendering with geometrically grounded interaction in complex, open-world urban environments. Although recent video-3DGS methods ease open-world scene capture, they are still unsuitable for benchmarking due to large visual and geometric sim-to-real gaps. To address these challenges, we introduce Wanderland, a real-to-sim framework that features multi-sensor capture, reliable reconstruction, accurate geometry, and robust view synthesis. Using this pipeline, we curate a diverse dataset of indoor-outdoor urban scenes and systematically demonstrate how image-only pipelines scale poorly, how geometry quality impacts novel view synthesis, and how all of these adversely affect navigation policy learning and evaluation reliability. Beyond serving as a trusted testbed for embodied navigation, Wanderland's rich raw sensor data further allows benchmarking of 3D reconstruction and novel view synthesis models. Our work establishes a new foundation for reproducible research in open-world embodied AI. The project website is at https://ai4ce.github.io/wanderland/.