CV | AI | Jun 4, 2025

AD-EE: Early Exiting for Fast and Reliable Vision-Language Models in Autonomous Driving

arXiv:2506.05404v2 | 2 citations | h-index: 12
Originality: Incremental advance
AI Analysis

The paper addresses real-time deployment challenges for Vision-Language Models in autonomous driving systems, offering incremental improvements in efficiency and reliability.

The paper tackles the high latency and computational overhead of Vision-Language Models in autonomous driving by proposing AD-EE, an early exit framework that reduces latency by up to 57.58% and improves object detection accuracy by up to 44% on real-world datasets.

With the rapid advancement of autonomous driving, deploying Vision-Language Models (VLMs) to enhance perception and decision-making has become increasingly common. However, the real-time application of VLMs is hindered by high latency and computational overhead, limiting their effectiveness in time-critical driving scenarios. This challenge is particularly evident when VLMs exhibit over-inference, continuing to process unnecessary layers even after confident predictions have been reached. To address this inefficiency, we propose AD-EE, an Early Exit framework that incorporates domain characteristics of autonomous driving and leverages causal inference to identify optimal exit layers. We evaluate our method on large-scale real-world autonomous driving datasets, including Waymo and the corner-case-focused CODA, as well as on a real vehicle running the Autoware Universe platform. Extensive experiments across multiple VLMs show that our method reduces latency by up to 57.58% and improves object detection accuracy by up to 44%.
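To make the idea of over-inference concrete, the following is a minimal sketch of generic confidence-threshold early exiting, not the AD-EE method itself: the abstract says AD-EE selects exit layers using causal inference and driving-domain characteristics, and that selection step is not reproduced here. The function names (`early_exit_forward`, `softmax`), the shared `exit_head`, the `threshold` value, and the toy layers are all hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_forward(
    hidden: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    exit_head: Callable[[Vector], Vector],
    threshold: float = 0.9,
) -> Tuple[int, float, int]:
    """Run layers in order and stop once the exit head is confident.

    Returns (predicted_class, confidence, layers_used). If no layer
    reaches the threshold, the prediction from the final layer is used.
    """
    if not layers:
        raise ValueError("at least one layer is required")
    for depth, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        probs = softmax(exit_head(hidden))
        confidence = max(probs)
        if confidence >= threshold:
            break  # skip the remaining layers (avoids over-inference)
    return probs.index(confidence), confidence, depth


# Toy stand-ins (assumptions): each "layer" doubles the features, and the
# exit head treats the features directly as class logits.
toy_layers = [lambda h: [2.0 * x for x in h] for _ in range(6)]
toy_head = lambda h: list(h)

pred, conf, used = early_exit_forward([0.5, 0.0], toy_layers, toy_head)
print(f"class={pred} confidence={conf:.3f} layers_used={used}/{len(toy_layers)}")
```

In this toy setup the confidence grows with depth, so the loop stops partway through the stack; raising `threshold` trades latency for more conservative predictions. A real VLM would attach exit heads to transformer layers and would need the exit layers chosen by some criterion, which is the part AD-EE addresses.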

