Understanding Adversarial Transferability in Vision-Language Models for Autonomous Driving: A Cross-Architecture Analysis

David Fernandez, Pedram MohajerAnsari, Amir Salarpour, Mert D. Pese

arXiv:2604.2741435.3


AI Analysis

The paper reveals a practical security risk for autonomous driving systems that use VLMs: adversarial attacks can transfer across architectures without knowledge of the target model.

The paper studies adversarial transferability across different VLM architectures for autonomous driving, finding high transfer rates (73-91%) and sustained frame-level manipulation (64.7-79.4%) even without target-specific optimization.

Vision-language models (VLMs) are increasingly used in autonomous driving because they combine visual perception with language-based reasoning, supporting more interpretable decision-making. Their robustness to physical adversarial attacks is not well understood, however, and in particular it is unclear whether such attacks transfer across different VLM architectures. This poses a practical risk when attackers do not know which model a vehicle uses. We address this gap with a systematic cross-architecture study of adversarial transferability in VLM-based driving. We evaluate three representative architectures (Dolphins, OmniDrive, and LeapVAD) using physically realizable patches placed on roadside infrastructure in both crosswalk and highway scenarios. Our transfer-matrix evaluation shows high cross-architecture effectiveness: transfer rates are 73-91% (mean TR = 0.815 for crosswalk and 0.833 for highway), and frame-level manipulation is sustained over 64.7-79.4% of the critical decision window even when patches are not optimized for the target model.
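
To make the transfer-matrix idea concrete, the following is a minimal sketch of how such an evaluation could be tabulated. It rests on assumptions not stated in the abstract: the transfer rate for a (source, target) pair is taken to be the fraction of trials in which a patch optimized on the source model also manipulates the target model, the mean TR is averaged over off-diagonal pairs only, and frame-level manipulation is the share of frames in the decision window whose output was altered. The function names and the outcome data are hypothetical and purely illustrative; they are not the paper's code or results.

```python
from statistics import mean

# Model names come from the abstract; everything else here is illustrative.
MODELS = ["Dolphins", "OmniDrive", "LeapVAD"]


def transfer_matrix(outcomes):
    """Map each (source, target) pair to its success fraction.

    outcomes[(source, target)] is a list of booleans, one per trial:
    True if the patch optimized on `source` manipulated `target`.
    (Assumed definition; the paper may define TR differently.)
    """
    return {pair: sum(trials) / len(trials) for pair, trials in outcomes.items()}


def mean_transfer_rate(matrix):
    """Average over cross-architecture pairs only (source != target)."""
    off_diagonal = [rate for (src, tgt), rate in matrix.items() if src != tgt]
    return mean(off_diagonal)


def manipulated_fraction(frame_flags):
    """Share of frames in the decision window with a manipulated output."""
    return sum(frame_flags) / len(frame_flags)


# Made-up trial outcomes, NOT data from the paper: white-box pairs always
# succeed, cross-architecture pairs succeed in 8 of 10 trials.
outcomes = {
    (src, tgt): [True] * 10 if src == tgt else [True] * 8 + [False] * 2
    for src in MODELS
    for tgt in MODELS
}

matrix = transfer_matrix(outcomes)
mean_tr = mean_transfer_rate(matrix)

# Made-up decision window of 10 frames, 7 of them manipulated.
window_fraction = manipulated_fraction([True] * 7 + [False] * 3)

print(f"mean cross-architecture TR: {mean_tr:.3f}")
print(f"manipulated share of window: {window_fraction:.1%}")
```

Excluding the diagonal keeps white-box results, where source and target are the same model, from inflating the cross-architecture mean; whether the paper averages this way is an assumption of the sketch.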
