The Road Ahead in Autonomous Driving: The KITScenes Multimodal Dataset
For autonomous driving researchers, this dataset addresses gaps in sensor fidelity, map completeness, and geographic diversity, but the contribution is incremental because it follows established dataset paradigms.
KITScenes Multimodal is a high-fidelity autonomous driving dataset with long-range lidar, 4D imaging radar, and HD maps that the authors describe as, to their knowledge, the most complete of any sensor dataset. The maps were validated through autonomous driving trials on open-source software. The dataset provides four benchmarks for spatial learning tasks: online HD map construction, long-range depth estimation, novel view synthesis, and end-to-end driving.
Existing autonomous driving datasets have enabled major progress, but fall short in sensor fidelity, map completeness, or geographic diversity. We present KITScenes Multimodal, a European dataset built around high-fidelity sensors and maps. Our fully synchronized sensor suite combines high-resolution global-shutter cameras, long-range lidar with a range beyond 400 m, 4D imaging radar, and redundant GNSS/INS localization. Our HD maps are, to our knowledge, the most complete of any sensor dataset, and are validated through autonomous driving trials on open-source software. For the first time in a public dataset, all driving-relevant traffic elements, such as traffic lights, are mapped in 3D to a reprojection-accurate level with full topological connectivity. Recorded in cities with irregular street layouts and mixed traffic modes, our dataset complements existing datasets by broadening the available geographic diversity. We also introduce four benchmarks, each advancing spatial learning for embodied AI: online HD map construction, long-range depth estimation, novel view synthesis, and end-to-end driving. Project page: https://kitscenes.com/