CR, AI, LG | Feb 5, 2024

Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models

arXiv:2402.06659v2 | 58 citations | h-index: 15 | Has Code | NIPS
Originality: Highly original
AI Analysis

This work exposes a security risk for VLM users: poisoned models can deliver deceptive misinformation, which has implications for responsible AI deployment.

The study examines the vulnerability of Vision-Language Models (VLMs) to stealthy data poisoning attacks. It introduces Shadowcast, which uses poison samples that are visually indistinguishable from benign images to manipulate model responses. The attack is effective with as few as 50 poison samples, and the poison samples transfer across VLM architectures.

Vision-Language Models (VLMs) excel in generating textual responses from visual inputs, but their versatility raises security concerns. This study takes the first step in exposing VLMs' susceptibility to data poisoning attacks that can manipulate responses to innocuous, everyday prompts. We introduce Shadowcast, a stealthy data poisoning attack where poison samples are visually indistinguishable from benign images with matching texts. Shadowcast demonstrates effectiveness in two attack types. The first is a traditional Label Attack, tricking VLMs into misidentifying class labels, such as confusing Donald Trump for Joe Biden. The second is a novel Persuasion Attack, leveraging VLMs' text generation capabilities to craft persuasive and seemingly rational narratives for misinformation, such as portraying junk food as healthy. We show that Shadowcast effectively achieves the attacker's intentions using as few as 50 poison samples. Crucially, the poisoned samples demonstrate transferability across different VLM architectures, posing a significant concern in black-box settings. Moreover, Shadowcast remains potent under realistic conditions involving various text prompts, training data augmentation, and image compression techniques. This work reveals how poisoned VLMs can disseminate convincing yet deceptive misinformation to everyday, benign users, emphasizing the importance of data integrity for responsible VLM deployments. Our code is available at: https://github.com/umd-huang-lab/VLM-Poisoning.

Code Implementations: 1 repo
