CV · AI · HC · Feb 6, 2025

Vision-Integrated LLMs for Autonomous Driving Assistance: Human Performance Comparison and Trust Evaluation

arXiv:2502.06843v1 · 1 citations · h-index: 1
Originality: Incremental advance
AI Analysis

The paper addresses the need for better reasoning in autonomous driving assistance, but the advance is incremental because it builds on existing models such as YOLOv4, ViT, and GPT-4.

The study tackles the difficulty that autonomous driving systems have with reasoning in complex scenarios. It introduces an LLM-based assistance system that integrates a vision adapter with GPT-4. In the evaluation, the system closely mirrored human performance in situation description and aligned moderately with human decision-making.

Traditional autonomous driving systems often struggle with reasoning in complex, unexpected scenarios due to limited comprehension of spatial relationships. In response, this study introduces a Large Language Model (LLM)-based Autonomous Driving (AD) assistance system that integrates a vision adapter and an LLM reasoning module to enhance visual understanding and decision-making. The vision adapter, combining YOLOv4 and Vision Transformer (ViT), extracts comprehensive visual features, while GPT-4 enables human-like spatial reasoning and response generation. Experimental evaluations with 45 experienced drivers revealed that the system closely mirrors human performance in describing situations and moderately aligns with human decisions in generating appropriate responses.
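The abstract does not say how the vision adapter's output reaches the LLM, so the following is only a minimal sketch of one plausible hand-off: detector results are serialized into a text prompt that a model such as GPT-4 could answer. The `Detection` structure, the `horizontal_zone` and `build_prompt` helpers, the confidence threshold, and the prompt wording are all hypothetical and not taken from the paper. ViT features and the actual LLM call are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One detected object (hypothetical structure, not from the paper)."""
    label: str          # class name, e.g. "pedestrian"
    confidence: float   # detector score in [0, 1]
    x_center: float     # horizontal box centre, normalised to [0, 1]


def horizontal_zone(x_center: float) -> str:
    """Map a normalised x coordinate to a coarse spatial word."""
    if x_center < 1 / 3:
        return "left"
    if x_center < 2 / 3:
        return "center"
    return "right"


def build_prompt(detections: List[Detection], min_confidence: float = 0.5) -> str:
    """Turn detections into a scene description prompt (assumed format)."""
    # Drop low-confidence detections before describing the scene.
    kept = [d for d in detections if d.confidence >= min_confidence]
    lines = [
        f"- {d.label}, {horizontal_zone(d.x_center)} of frame, score {d.confidence:.2f}"
        for d in kept
    ]
    scene = "\n".join(lines) if lines else "- no objects detected"
    return (
        "You are a driving assistant. Objects detected in the current frame:\n"
        f"{scene}\n"
        "Describe the situation and recommend one driving action."
    )
```

Calling `build_prompt([Detection("pedestrian", 0.91, 0.5)])` produces a prompt that lists the pedestrian in the center of the frame. The resulting string would then be sent to the reasoning module, a step this sketch does not cover.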

