CV · AI · Jun 4, 2025

AD-EE: Early Exiting for Fast and Reliable Vision-Language Models in Autonomous Driving

Lianming Huang, Haibo Hu, Yufei Cui, Jiacheng Zuo, Shangyu Wu, Nan Guan, Chun Jason Xue

arXiv:2506.05404v23.62 citations · h-index: 12

Originality: Incremental advance

AI Analysis

This work addresses real-time deployment challenges for autonomous driving systems, offering incremental improvements in efficiency and reliability.

The paper tackles the high latency and computational overhead of Vision-Language Models in autonomous driving by proposing AD-EE, an early exit framework that reduces latency by up to 57.58% and improves object detection accuracy by up to 44% on real-world datasets.

With the rapid advancement of autonomous driving, deploying Vision-Language Models (VLMs) to enhance perception and decision-making has become increasingly common. However, the real-time application of VLMs is hindered by high latency and computational overhead, limiting their effectiveness in time-critical driving scenarios. This challenge is particularly evident when VLMs exhibit over-inference, continuing to process unnecessary layers even after confident predictions have been reached. To address this inefficiency, we propose AD-EE, an Early Exit framework that incorporates domain characteristics of autonomous driving and leverages causal inference to identify optimal exit layers. We evaluate our method on large-scale real-world autonomous driving datasets, including Waymo and the corner-case-focused CODA, as well as on a real vehicle running the Autoware Universe platform. Extensive experiments across multiple VLMs show that our method significantly reduces latency, with maximum improvements reaching up to 57.58%, and enhances object detection accuracy, with maximum gains of up to 44%.
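The over-inference problem described in the abstract can be illustrated with a minimal sketch of generic confidence-threshold early exiting. This is not the AD-EE method: the abstract says AD-EE selects exit layers via causal inference, and that procedure is not reproduced here. The function `early_exit_forward`, the parameters `exit_layers` and `threshold`, and the toy layers are all hypothetical names chosen for illustration.

```python
import math
from typing import Callable, List, Sequence, Set, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_forward(
    x: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    head: Callable[[Vector], Vector],
    exit_layers: Set[int],
    threshold: float = 0.9,
) -> Tuple[int, float, int]:
    """Run layers in order and stop at the first candidate exit layer whose
    prediction confidence reaches `threshold` (hypothetical criterion).

    Returns (predicted class index, confidence, number of layers executed).
    The final layer always produces a prediction.
    """
    if not layers:
        raise ValueError("layers must be non-empty")
    hidden = x
    last = len(layers)
    for depth, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        if depth in exit_layers or depth == last:
            probs = softmax(head(hidden))
            confidence = max(probs)
            if confidence >= threshold or depth == last:
                return probs.index(confidence), confidence, depth
    raise AssertionError("unreachable: the final layer always returns")


# Toy stand-ins (assumptions): each "layer" doubles the hidden state,
# so confidence in class 0 grows with depth; the "head" is the identity.
toy_layers = [lambda h: [2.0 * v for v in h] for _ in range(6)]
identity_head = lambda h: list(h)

pred, conf, depth = early_exit_forward(
    [0.5, 0.0], toy_layers, identity_head, exit_layers={2, 3}, threshold=0.9
)
# Without any candidate exits, all six layers are executed.
_, _, full_depth = early_exit_forward(
    [0.5, 0.0], toy_layers, identity_head, exit_layers=set(), threshold=0.9
)
```

In this toy run the confidence at layer 2 falls just below the threshold, so the forward pass continues and exits at layer 3 instead of running all six layers. How well such a scheme works in practice depends on which layers are offered as exits, which is the part AD-EE addresses.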
