RO AI CV HC
Jul 14, 2025

Scene-Aware Conversational ADAS with Generative AI for Real-Time Driver Assistance

Kyungtae Han, Yitao Chen, Rohit Gupta, Onur Altintas

arXiv:2507.10500v1
h-index: 8

Originality: Incremental advance

AI Analysis

The paper addresses the need for more flexible and interactive driver assistance in dynamic driving environments, though the advance is incremental because it builds on existing Generative AI and ADAS technologies.

The paper tackles limited scene interpretation and natural language interaction in Advanced Driver Assistance Systems (ADAS) by introducing SC-ADAS, a modular framework that integrates Generative AI to enable real-time, interpretable, and adaptive driver assistance through multi-turn dialogue grounded in visual and sensor context. Evaluation in the CARLA simulator shows that the approach is feasible, with trade-offs such as increased latency and token growth.

While autonomous driving technologies continue to advance, current Advanced Driver Assistance Systems (ADAS) remain limited in their ability to interpret scene context or engage with drivers through natural language. These systems typically rely on predefined logic and lack support for dialogue-based interaction, making them inflexible in dynamic environments or when adapting to driver intent. This paper presents Scene-Aware Conversational ADAS (SC-ADAS), a modular framework that integrates Generative AI components including large language models, vision-to-text interpretation, and structured function calling to enable real-time, interpretable, and adaptive driver assistance. SC-ADAS supports multi-turn dialogue grounded in visual and sensor context, allowing natural language recommendations and driver-confirmed ADAS control. Implemented in the CARLA simulator with cloud-based Generative AI, the system executes confirmed user intents as structured ADAS commands without requiring model fine-tuning. We evaluate SC-ADAS across scene-aware, conversational, and revisited multi-turn interactions, highlighting trade-offs such as increased latency from vision-based context retrieval and token growth from accumulated dialogue history. These results demonstrate the feasibility of combining conversational reasoning, scene perception, and modular ADAS control to support the next generation of intelligent driver assistance.
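
The driver-confirmed, structured-command flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the command names, the JSON shape of the model output, the confirmation phrases, and the `DialogueSession` class are all assumptions made for illustration. It shows a proposed command being held until the driver confirms it, and the dialogue history growing with each turn (the source of the token growth the abstract mentions).

```python
import json
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical command whitelist; the paper's actual ADAS command set is not given here.
ALLOWED_COMMANDS = {"set_cruise_speed", "set_following_distance"}
# Hypothetical confirmation phrases.
CONFIRMATIONS = {"yes", "ok", "confirm"}


@dataclass
class DialogueSession:
    """Holds dialogue history and at most one command awaiting driver confirmation."""
    history: list = field(default_factory=list)
    pending: Optional[dict] = None

    def propose(self, llm_output: str) -> str:
        """Parse a structured function call (assumed JSON) and ask the driver to confirm."""
        call = json.loads(llm_output)
        if call.get("name") not in ALLOWED_COMMANDS:
            raise ValueError(f"unknown ADAS command: {call.get('name')!r}")
        self.pending = call
        self.history.append(("assistant", llm_output))
        return f"Shall I run {call['name']} with {call['arguments']}?"

    def confirm(self, driver_reply: str) -> Optional[dict]:
        """Return the pending command only if the driver confirms; otherwise drop it."""
        self.history.append(("driver", driver_reply))
        call, self.pending = self.pending, None
        if call is not None and driver_reply.strip().lower() in CONFIRMATIONS:
            return call
        return None

    def history_size(self) -> int:
        """Character count of accumulated history, a rough stand-in for token count."""
        return sum(len(text) for _, text in self.history)
```

In this sketch nothing is executed until `confirm` returns a command, which mirrors the abstract's "driver-confirmed ADAS control"; a real system would forward the returned dictionary to the vehicle or simulator interface.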
