MM · LG · Jun 19, 2024

Enhancing Cross-Prompt Transferability in Vision-Language Models through Contextual Injection of Target Tokens

arXiv:2406.13294v1 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in adversarial attacks for vision-language models, offering incremental improvements for security testing.

The paper tackles the problem of adversarial images failing to deceive all prompts in vision-language models by proposing a Contextual-Injection Attack that improves cross-prompt transferability, achieving superior performance over existing methods on models like BLIP2, InstructBLIP, and LLaVA.

Vision-language models (VLMs) seamlessly integrate visual and textual data to perform tasks such as image classification, caption generation, and visual question answering. However, adversarial images often struggle to deceive all prompts effectively in the context of cross-prompt migration attacks, as the probability distribution of the tokens in these images tends to favor the semantics of the original image rather than the target tokens. To address this challenge, we propose a Contextual-Injection Attack (CIA) that employs gradient-based perturbation to inject target tokens into both visual and textual contexts, thereby improving the probability distribution of the target tokens. By shifting the contextual semantics towards the target tokens instead of the original image semantics, CIA enhances the cross-prompt transferability of adversarial images. Extensive experiments on the BLIP2, InstructBLIP, and LLaVA models show that CIA outperforms existing methods in cross-prompt transferability, demonstrating its potential for more effective adversarial strategies in VLMs.
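The generic idea of a gradient-based perturbation that raises a target token's probability across several prompts can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the linear "model" `W`, the per-prompt biases in `PROMPT_BIASES`, and the names `token_probs`, `mean_loss_grad`, and `attack` are hypothetical stand-ins, not the paper's CIA implementation or any real VLM API. The sketch also perturbs only the image features; it does not reproduce CIA's injection of target tokens into the textual context.

```python
import math

# Toy stand-in for a VLM (an assumption, not a real model): logits over a
# 3-token vocabulary are linear in the image features plus a prompt bias.
W = [
    [1.0, 0.5, -0.5, 0.0],    # token 0: original image semantics
    [0.0, -0.5, 1.0, 0.5],    # token 1: unrelated token
    [-1.0, 0.5, 0.0, -0.5],   # token 2: attack target
]
PROMPT_BIASES = [[0.0, 0.0, 0.0], [0.5, 0.0, -0.5], [0.0, 0.5, 0.0]]
TARGET = 2


def token_probs(x, bias):
    """Softmax over the toy vocabulary for image features x under one prompt."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W, bias)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_loss_grad(x):
    """Gradient of the prompt-averaged cross-entropy toward TARGET w.r.t. x."""
    grad = [0.0] * len(x)
    for bias in PROMPT_BIASES:
        probs = token_probs(x, bias)
        probs[TARGET] -= 1.0  # d(loss)/d(logits) = softmax - one_hot(target)
        for row, err in zip(W, probs):
            for i, w in enumerate(row):
                grad[i] += w * err / len(PROMPT_BIASES)
    return grad


def attack(x, eps=0.5, alpha=0.05, steps=20):
    """Signed-gradient descent on the loss, projected onto an L-infinity ball."""
    adv = list(x)
    for _ in range(steps):
        grad = mean_loss_grad(adv)
        for i, g in enumerate(grad):
            step = alpha if g > 0 else -alpha if g < 0 else 0.0
            # Clip so the perturbation never exceeds eps per coordinate.
            adv[i] = min(max(adv[i] - step, x[i] - eps), x[i] + eps)
    return adv


clean = [1.0, 0.5, 0.0, 0.0]
adv = attack(clean)
for bias in PROMPT_BIASES:
    print(token_probs(clean, bias)[TARGET], "->", token_probs(adv, bias)[TARGET])
```

Averaging the gradient over all prompts, rather than optimizing against a single prompt, is what pushes the perturbation toward a direction that helps the target token regardless of the textual context. In this toy setup the printed target probability rises under each of the three prompts while the perturbation stays within `eps`.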

Code Implementations: 1 repo
