CV · Jul 15, 2024
Comparing Optical Flow and Deep Learning to Enable Computationally Efficient Traffic Event Detection with Space-Filling Curves
Tayssir Bouraffa, Elias Kjellberg Carlson, Erik Wessman et al.
Gathering data and identifying events in various traffic situations remains an essential challenge for the systematic evaluation of a perception system's performance. Analyzing large-scale, typically unstructured, multi-modal, time series data obtained from video, radar, and LiDAR is computationally demanding, particularly when meta-information or annotations are missing. We compare Optical Flow (OF) and Deep Learning (DL) as inputs to computationally efficient event detection via space-filling curves on video data from a forward-facing, in-vehicle camera. Our first approach leverages unexpected disturbances in the OF field from vehicle surroundings; the second approach is a DL model trained on human visual attention to predict a driver's gaze and thereby spot potential event locations. We feed these results to a space-filling curve to reduce dimensionality and achieve computationally efficient event retrieval. We systematically evaluate our concept by obtaining characteristic patterns for both approaches from a large-scale virtual dataset (SMIRK) and apply our findings to the Zenseact Open Dataset (ZOD), a large multi-modal, real-world dataset collected over two years in 14 different European countries. Our results show that the OF approach excels in specificity and reduces false positives, while the DL approach demonstrates superior sensitivity. Both approaches offer comparable processing speed, making them suitable for real-time applications.
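To illustrate how a space-filling curve can reduce dimensionality, the minimal sketch below maps a 2D image location (for example, the position of an optical-flow disturbance or a predicted gaze point) to a single integer, so that a sequence of frames becomes a one-dimensional series that is cheap to search. It assumes a Morton (Z-order) curve, which is only one possible space-filling curve; the abstract does not state which curve the authors use, and the function names and the `bits` parameter are hypothetical.

```python
from typing import Iterable, List, Tuple


def morton_encode_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into one Z-order (Morton) index.

    Assumption: x and y are non-negative integers below 2**bits,
    e.g. pixel or grid-cell coordinates of a detected hot spot.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return code


def morton_decode_2d(code: int, bits: int = 16) -> Tuple[int, int]:
    """Invert morton_encode_2d and recover the (x, y) pair."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


def locations_to_series(locations: Iterable[Tuple[int, int]]) -> List[int]:
    """Turn per-frame (x, y) hot-spot locations into a 1D series of indices."""
    return [morton_encode_2d(x, y) for x, y in locations]


if __name__ == "__main__":
    # Hypothetical hot-spot locations for three consecutive frames.
    frames = [(3, 5), (4, 5), (200, 120)]
    print(locations_to_series(frames))
```

Because nearby 2D locations tend to receive nearby indices on such a curve, an event that shifts the hot spot abruptly shows up as a jump in the one-dimensional series, which can be detected with simple range or threshold queries.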
56.0 · CV · May 3
From Concept to Capability: Evaluating 3D Gaussian Splatting for Synthetic Scene Editing in Autonomous Driving
Ali Nouri, Yifei Zhang, Yifan Zhang et al.
The perception of an Autonomous Driving System (ADS) critically depends on relevant, comprehensive, and diverse datasets to ensure its safety while operating in the environment. Field data collection is incomplete with respect to the rare but still possible safety-related scenarios needed for the development, verification, and validation of the ADS. 3D Gaussian Splatting (3DGS) has shown promising capabilities for the reconstruction and editing of scenes based on data collected by cameras and LiDAR sensors. However, the evaluation of reconstruction fidelity from an industrial perspective remains underexplored, even though it is crucial when employing such methods in safety-related systems, especially for ADS. This becomes more challenging because an ADS operates in a dynamic, uncontrolled environment with limited viewpoints and often partially occluded objects. This paper addresses this gap by proposing and implementing a framework (Fig. 1) to systematically analyze the capabilities and limitations of 3DGS for use in the reconstruction of safety-related scenes. It focuses on the quality of reconstruction for vehicles and pedestrians, which are the two most critical object classes for ADS. Our findings provide industry insights into the fidelity degradation of reconstructions from multiple novel viewpoints, both lateral and longitudinal, enabling the integration of these methods into real-world industrial AD software development and testing pipelines.