AI CV RO · Jan 21

AutoDriDM: An Explainable Benchmark for Decision-Making of Vision-Language Models in Autonomous Driving

Zecong Tang, Zixu Wang, Yifei Wang, Weitong Lian, Tianjian Gao, Haoran Li, Tengju Ru, Lingyi Meng, Zhejun Cui, Yichen Zhu, Qi Kang, Kaixuan Wang

arXiv:2601.14702v16.02 citations · h-index: 3

Originality: Incremental advance

AI Analysis

This work addresses the need for safer and more reliable vision-language models in autonomous driving by shifting the focus of evaluation from perception to decision-making, though as a benchmarking contribution it is an incremental advance.

The authors address the inadequate assessment of decision-making in vision-language models for autonomous driving by introducing AutoDriDM, a benchmark of 6,650 questions. Their evaluation reveals weak alignment between perception and decision performance and identifies key failure modes such as logical reasoning errors.

Autonomous driving is a highly challenging domain that requires reliable perception and safe decision-making in complex scenarios. Recent vision-language models (VLMs) demonstrate reasoning and generalization abilities, opening new possibilities for autonomous driving; however, existing benchmarks and metrics overemphasize perceptual competence and fail to adequately assess decision-making processes. In this work, we present AutoDriDM, a decision-centric, progressive benchmark with 6,650 questions across three dimensions - Object, Scene, and Decision. We evaluate mainstream VLMs to delineate the perception-to-decision capability boundary in autonomous driving, and our correlation analysis reveals weak alignment between perception and decision-making performance. We further conduct explainability analyses of models' reasoning processes, identifying key failure modes such as logical reasoning errors, and introduce an analyzer model to automate large-scale annotation. AutoDriDM bridges the gap between perception-centered and decision-centered evaluation, providing guidance toward safer and more reliable VLMs for real-world autonomous driving.
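The correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which coefficient the authors use, so Pearson correlation is an assumption here, and the model names and accuracy values below are hypothetical placeholders, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-model accuracies on the three benchmark dimensions.
# These numbers are made up for illustration only.
scores = {
    "model_a": {"object": 0.82, "scene": 0.75, "decision": 0.51},
    "model_b": {"object": 0.78, "scene": 0.80, "decision": 0.62},
    "model_c": {"object": 0.90, "scene": 0.71, "decision": 0.48},
    "model_d": {"object": 0.65, "scene": 0.68, "decision": 0.57},
}

# Treat the mean of Object and Scene accuracy as a perception score (an assumption).
perception = [(s["object"] + s["scene"]) / 2 for s in scores.values()]
decision = [s["decision"] for s in scores.values()]

r = pearson(perception, decision)
print(f"perception vs. decision correlation: {r:.3f}")
```

A coefficient near zero across models would correspond to the weak alignment the abstract describes, while a value near 1 would mean that perception accuracy largely predicts decision accuracy.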
