EO-VLM: VLM-Guided Energy Overload Attacks on Vision Models
EO-VLM exposes a critical vulnerability in vision models used in applications such as autonomous driving and CCTV monitoring, one that threatens system availability.
The paper tackles the susceptibility of vision models to resource-consuming attacks by introducing EO-VLM, a novel energy-overloading attack that uses VLM prompts to generate adversarial images, resulting in up to a 50% increase in GPU energy consumption.
Vision models are increasingly deployed in critical applications such as autonomous driving and CCTV monitoring, yet they remain susceptible to resource-consuming attacks. In this paper, we introduce a novel energy-overloading attack that leverages vision-language model (VLM) prompts to generate adversarial images targeting vision models. These images, though imperceptible to the human eye, significantly increase GPU energy consumption across various vision models, threatening the availability of these systems. Our framework, EO-VLM (Energy Overload via VLM), is model-agnostic: it is not limited by the architecture or type of the target vision model. By exploiting the lack of safety filters in VLMs like DALL-E 3, we create adversarial noise images without requiring prior knowledge of the target vision models or their internal structure. Our experiments demonstrate up to a 50% increase in energy consumption, revealing a critical vulnerability in current vision models.
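To make the "up to a 50% increase in energy consumption" figure concrete, the following is a minimal sketch of how a relative energy increase could be computed from sampled GPU power readings. It is not the paper's measurement procedure, which the abstract does not describe. The function names `energy_joules` and `relative_increase` are hypothetical, and the power traces are made-up illustrative numbers, not experimental results. In practice the samples might come from a tool such as `nvidia-smi`, which is also an assumption here.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate sampled power (watts) over time (seconds) with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    if len(timestamps_s) < 2:
        raise ValueError("need at least two samples to integrate")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def relative_increase(baseline_j: float, attacked_j: float) -> float:
    """Fractional increase of attacked energy over baseline (0.5 means +50%)."""
    if baseline_j <= 0:
        raise ValueError("baseline energy must be positive")
    return (attacked_j - baseline_j) / baseline_j


# Hypothetical power traces, sampled once per second during inference.
timestamps = [0.0, 1.0, 2.0, 3.0, 4.0]
clean_power = [100.0, 100.0, 100.0, 100.0, 100.0]        # benign input
adversarial_power = [150.0, 150.0, 150.0, 150.0, 150.0]  # adversarial input

clean_energy = energy_joules(timestamps, clean_power)
adversarial_energy = energy_joules(timestamps, adversarial_power)
increase = relative_increase(clean_energy, adversarial_energy)
print(f"clean: {clean_energy:.1f} J, adversarial: {adversarial_energy:.1f} J, "
      f"increase: {increase:.0%}")
```

Integrating power over the inference window, rather than comparing peak power alone, also captures attacks that lengthen inference time without raising the instantaneous draw.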