CV · AI · Feb 1, 2025

INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation

arXiv:2502.00262v3 · 11 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses safety challenges in autonomous driving by enhancing hazard detection for edge cases, representing an incremental advancement through supervised fine-tuning of vision-language models.

The paper tackles the problem of autonomous driving systems struggling with unpredictable edge-case scenarios by proposing INSIGHT, a hierarchical vision-language model framework that integrates semantic and visual inputs for hazard detection and evaluation. Experimental results on the BDD100K dataset show substantial improvements in hazard prediction accuracy and generalization performance over existing models.

Autonomous driving systems face significant challenges in handling unpredictable edge-case scenarios, such as adversarial pedestrian movements, dangerous vehicle maneuvers, and sudden environmental changes. Current end-to-end driving models generalize poorly to these rare events because of limitations in traditional detection and prediction approaches. To address this, we propose INSIGHT (Integration of Semantic and Visual Inputs for Generalized Hazard Tracking), a hierarchical vision-language model (VLM) framework designed to enhance hazard detection and edge-case evaluation. Using multimodal data fusion, our approach integrates semantic and visual representations, enabling precise interpretation of driving scenarios and accurate forecasting of potential dangers. Through supervised fine-tuning of VLMs, we optimize spatial hazard localization using attention-based mechanisms and coordinate regression techniques. Experimental results on the BDD100K dataset demonstrate a substantial improvement in hazard prediction accuracy over existing models, along with a notable increase in generalization performance. This advancement enhances the robustness and safety of autonomous driving systems, supporting improved situational awareness and decision-making in complex real-world scenarios.
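
The abstract mentions attention-based mechanisms and coordinate regression for spatial hazard localization but gives no implementation detail. As a rough illustration only, the sketch below shows one common way to combine the two ideas: a soft-argmax that turns a grid of attention scores into a single normalized (x, y) point, plus an L1 regression loss against a labeled hazard location. This is an assumption for illustration, not the INSIGHT implementation; the function names `soft_argmax_2d` and `l1_loss` and the toy grid are hypothetical.

```python
import math


def soft_argmax_2d(scores):
    """Attention-weighted (x, y) location of a score grid, normalized to [0, 1].

    Illustrative assumption: `scores` is a rows x cols grid of unnormalized
    attention scores over image patches. This is not taken from the paper.
    """
    rows, cols = len(scores), len(scores[0])
    # Subtract the peak score before exponentiating for numerical stability.
    peak = max(max(row) for row in scores)
    weights = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in weights)
    # Expected patch-center coordinate under the softmax distribution.
    x = sum(w * (c + 0.5) / cols for row in weights for c, w in enumerate(row)) / total
    y = sum(w * (r + 0.5) / rows for r, row in enumerate(weights) for w in row) / total
    return x, y


def l1_loss(pred, target):
    """Mean absolute error between predicted and labeled coordinates."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Toy 3x3 attention grid with a strong response in the middle-right patch.
toy_scores = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 20.0],
    [0.0, 0.0, 0.0],
]
predicted_point = soft_argmax_2d(toy_scores)
loss = l1_loss(predicted_point, (0.8, 0.5))  # hypothetical labeled hazard point
```

Because the soft-argmax is differentiable, a loss of this kind could in principle be backpropagated into the attention scores during supervised fine-tuning; whether INSIGHT does this is not stated in the abstract.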

