CV CR LG | Feb 4, 2024

DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers

Oryan Yehezkel, Alon Zolfi, Amit Baras, Yuval Elovici, Asaf Shabtai

arXiv:2402.02554v25.22 citations | h-index: 73 | Has Code | NIPS

Originality: Incremental advance

AI Analysis

This work addresses a security problem for users of efficient vision transformers, highlighting a critical threat to model availability.

The paper tackles the vulnerability of token sparsification mechanisms in vision transformers to adversarial attacks, presenting DeSparsify, which successfully exhausts system resources while remaining stealthy, as demonstrated in evaluations on three mechanisms.

Vision transformers have contributed greatly to advancements in the computer vision domain, demonstrating state-of-the-art performance in diverse tasks (e.g., image classification, object detection). However, their high computational requirements grow quadratically with the number of tokens used. Token sparsification mechanisms have been proposed to address this issue. These mechanisms employ an input-dependent strategy, in which uninformative tokens are discarded from the computation pipeline, improving the model's efficiency. However, their dynamism and average-case assumption make them vulnerable to a new threat vector: carefully crafted adversarial examples capable of fooling the sparsification mechanism, resulting in worst-case performance. In this paper, we present DeSparsify, an attack targeting the availability of vision transformers that use token sparsification mechanisms. The attack aims to exhaust the operating system's resources, while maintaining its stealthiness. Our evaluation demonstrates the attack's effectiveness on three token sparsification mechanisms and examines the attack's transferability between them and its effect on the GPU resources. To mitigate the impact of the attack, we propose various countermeasures.
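The cost argument in the abstract can be made concrete with a toy sketch. The Python below is not the DeSparsify attack and does not reproduce any of the three evaluated mechanisms. It assumes a hypothetical threshold rule (`keep_tokens`) and hand-picked importance scores, and it counts only pairwise token interactions as a stand-in for attention cost. It illustrates why an input that makes every token look informative pushes an input-dependent pruning scheme back to its worst case.

```python
from typing import List

# Toy illustration only: the scores, threshold, and function names are
# hypothetical and not taken from the paper or any library.


def attention_cost(num_tokens: int) -> int:
    """Count pairwise token interactions, which grow as n^2."""
    return num_tokens * num_tokens


def keep_tokens(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Keep the indices of tokens whose importance score meets the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Average-case input: most tokens look uninformative and are discarded.
benign_scores = [0.9, 0.1, 0.2, 0.8, 0.05, 0.1, 0.3, 0.7]
# Worst-case input: every token looks informative, so none is discarded.
adversarial_scores = [0.9] * 8

benign_kept = keep_tokens(benign_scores)
adversarial_kept = keep_tokens(adversarial_scores)

benign_cost = attention_cost(len(benign_kept))
adversarial_cost = attention_cost(len(adversarial_kept))
print(benign_cost, adversarial_cost)
```

In this toy setting, the benign input keeps 3 of 8 tokens and the worst-case input keeps all 8, so the interaction count rises from 9 to 64.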

