Zero-shot Hazard Identification in Autonomous Driving: A Case Study on the COOOL Benchmark
This work addresses the detection of out-of-label hazards in autonomous driving, a safety-critical problem. Its contribution is incremental: it combines existing methods and evaluates them on a new benchmark.
The paper tackles zero-shot hazard identification in autonomous driving by integrating methods for driver reaction detection, hazard object identification, and hazard captioning. The resulting pipeline reduces relative error by 33% compared with the baseline methods and placed 2nd among 32 teams on the COOOL benchmark leaderboard.
This paper presents our submission to the COOOL competition, a novel benchmark for detecting and classifying out-of-label hazards in autonomous driving. Our approach integrates diverse methods across three core tasks: (i) driver reaction detection, (ii) hazard object identification, and (iii) hazard captioning. For driver reaction detection, we propose kernel-based change point detection applied to bounding-box and optical-flow dynamics to analyze motion patterns. For hazard identification, we combine a naive proximity-based strategy with object classification using a pre-trained ViT model. Finally, for hazard captioning, we use the MOLMO vision-language model with tailored prompts to generate precise and context-aware descriptions of rare and low-resolution hazards. The proposed pipeline outperforms the baseline methods by a large margin, reducing the relative error by 33%, and placed 2nd on the final leaderboard among 32 teams.
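To make the driver reaction step concrete, the following is a minimal sketch of single change point detection with a kernel cost, written in plain Python. It is an illustration under stated assumptions, not the submitted implementation: the abstract does not specify the kernel, the input features, or the number of change points, so the RBF kernel, the `gamma` and `min_size` parameters, and the 1-D per-frame signal (for example, a mean optical-flow magnitude or a bounding-box size) are all hypothetical choices, as are the function names.

```python
import math


def rbf(a, b, gamma=1.0):
    """RBF kernel between two scalar observations (assumed kernel choice)."""
    return math.exp(-gamma * (a - b) ** 2)


def segment_cost(gram, start, end):
    """Kernel scatter of signal[start:end]: sum_i k(x_i, x_i) minus the
    mean of all pairwise kernel values times the segment length."""
    n = end - start
    diag = sum(gram[i][i] for i in range(start, end))
    full = sum(gram[i][j] for i in range(start, end) for j in range(start, end))
    return diag - full / n


def detect_single_change(signal, gamma=1.0, min_size=2):
    """Return the index t that best splits the signal into signal[:t] and
    signal[t:], i.e. the split minimising the summed kernel cost."""
    n = len(signal)
    if n < 2 * min_size:
        raise ValueError("signal is too short for the requested min_size")
    # Precompute the Gram matrix once; fine for short clips, O(n^2) memory.
    gram = [[rbf(a, b, gamma) for b in signal] for a in signal]
    best_t, best_cost = None, math.inf
    for t in range(min_size, n - min_size + 1):
        cost = segment_cost(gram, 0, t) + segment_cost(gram, t, n)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


if __name__ == "__main__":
    # Hypothetical per-frame motion signal with an abrupt change at frame 10.
    motion = [1.0] * 10 + [5.0] * 10
    print(detect_single_change(motion))
```

On the clean step signal in the usage example, the minimum cost is reached at index 10, the first frame of the new regime. Real bounding-box or optical-flow signals are noisy and multi-dimensional, so a practical version would likely need a vector-valued kernel, a tuned bandwidth, and a faster cost computation than this brute-force search.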