CVLGDec 26, 2025

Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models

arXiv:2512.21815v12 citationsh-index: 5
Originality Incremental advance
AI Analysis

This exposes critical safety risks in current vision-language models, potentially affecting their deployment in sensitive applications, and is incremental by building on prior entropy-based attacks.

The paper tackled the vulnerability of vision-language models to adversarial attacks by showing that focusing perturbations on a small fraction (about 20%) of high-entropy tokens, which are critical decision points in generation, leads to semantic degradation comparable to global methods with smaller budgets, achieving 35-49% harmful conversion rates and 93-95% attack success rates.

Vision-language models (VLMs) achieve remarkable performance but remain vulnerable to adversarial attacks. Entropy, a measure of model uncertainty, is strongly correlated with the reliability of VLM. Prior entropy-based attacks maximize uncertainty at all decoding steps, implicitly assuming that every token contributes equally to generation instability. We show instead that a small fraction (about 20%) of high-entropy tokens, i.e., critical decision points in autoregressive generation, disproportionately governs output trajectories. By concentrating adversarial perturbations on these positions, we achieve semantic degradation comparable to global methods while using substantially smaller budgets. More importantly, across multiple representative VLMs, such selective attacks convert 35-49% of benign outputs into harmful ones, exposing a more critical safety risk. Remarkably, these vulnerable high-entropy forks recur across architecturally diverse VLMs, enabling feasible transferability (17-26% harmful rates on unseen targets). Motivated by these findings, we propose Entropy-bank Guided Adversarial attacks (EGA), which achieves competitive attack success rates (93-95%) alongside high harmful conversion, thereby revealing new weaknesses in current VLM safety mechanisms.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes