AI | Jul 29, 2025

SafeDriveRAG: Towards Safe Autonomous Driving with Knowledge Graph-based Retrieval-Augmented Generation

Hao Ye, Mengshi Qi, Zhaohong Liu, Liang Liu, Huadong Ma

arXiv:2507.21585v1 | 11 citations | h-index: 29 | Has Code | MM

Originality: Incremental advance

AI Analysis

This work addresses the gap in safety evaluation for autonomous driving systems, though the advance is incremental because it builds on existing VLM and RAG techniques.

The authors tackled the problem of evaluating vision-language models in traffic safety-critical autonomous driving scenarios by creating the SafeDrive228K benchmark and proposing SafeDriveRAG, a knowledge graph-based retrieval-augmented generation method. Their approach improved performance by +4.73% to +14.57% across different safety tasks on five mainstream VLMs.

In this work, we study how vision-language models (VLMs) can be used to enhance the safety of autonomous driving systems, including perception, situational understanding, and path planning. However, existing research has largely overlooked the evaluation of these models in traffic safety-critical driving scenarios. To bridge this gap, we create a benchmark (SafeDrive228K) and propose a new baseline based on a VLM with knowledge graph-based retrieval-augmented generation (SafeDriveRAG) for visual question answering (VQA). Specifically, we introduce SafeDrive228K, the first large-scale multimodal question-answering benchmark comprising 228K examples across 18 sub-tasks. This benchmark encompasses a diverse range of traffic safety queries, from traffic accidents and corner cases to common safety knowledge, enabling a thorough assessment of the comprehension and reasoning abilities of the models. Furthermore, we propose a plug-and-play multimodal knowledge graph-based retrieval-augmented generation approach that employs a novel multi-scale subgraph retrieval algorithm for efficient information retrieval. By incorporating traffic safety guidelines collected from the Internet, this framework further enhances the model's capacity to handle safety-critical situations. Finally, we conduct comprehensive evaluations on five mainstream VLMs to assess their reliability in safety-sensitive driving tasks. Experimental results demonstrate that integrating RAG significantly improves performance, with gains of +4.73% on Traffic Accidents tasks, +8.79% on Corner Cases tasks, and +14.57% on Traffic Safety Commonsense tasks across the five VLMs, underscoring the potential of our proposed benchmark and methodology for advancing research in traffic safety. Our source code and data are available at https://github.com/Lumos0507/SafeDriveRAG.
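The abstract does not spell out the multi-scale subgraph retrieval algorithm, so the following is only a toy sketch of the general idea behind knowledge graph-based retrieval for a VQA prompt. It is not the authors' method. It uses plain k-hop neighborhood expansion as a stand-in, and the graph contents, the function names `k_hop_triples` and `build_prompt`, and the prompt layout are all assumptions made up for illustration.

```python
from collections import deque

# Toy knowledge graph as an adjacency list: entity -> list of (relation, entity).
# Every entity and relation below is invented for illustration only.
TOY_GRAPH = {
    "wet road": [("increases", "braking distance"), ("requires", "reduced speed")],
    "braking distance": [("affects", "following distance")],
    "reduced speed": [("recommended in", "heavy rain")],
    "following distance": [],
    "heavy rain": [("reduces", "visibility")],
    "visibility": [],
}


def k_hop_triples(graph, seeds, k):
    """Collect (head, relation, tail) triples within k hops of the seed entities.

    Plain breadth-first expansion; a stand-in for a real subgraph retriever.
    """
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    triples = []
    while queue:
        node, depth = queue.popleft()
        if depth == k:
            continue  # do not expand beyond the requested radius
        for relation, neighbor in graph.get(node, []):
            triples.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return triples


def build_prompt(question, triples):
    """Prepend retrieved facts to the question (hypothetical prompt layout)."""
    facts = "\n".join(f"- {h} {r} {t}" for h, r, t in triples)
    return f"Known safety facts:\n{facts}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Retrieve at two radii to mimic looking at the graph at more than one scale.
    narrow = k_hop_triples(TOY_GRAPH, ["wet road"], k=1)
    wide = k_hop_triples(TOY_GRAPH, ["wet road"], k=2)
    print(build_prompt("How should the vehicle react to a wet road?", wide))
```

In a plug-and-play setup of the kind the abstract describes, the retrieved text would be passed to an unmodified VLM together with the image and question. How the seed entities are chosen and how different scales are combined is left open here because the abstract does not say.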

