CV | CR | LG | Feb 4, 2024

DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers

arXiv:2402.02554v2 | 2 citations | h-index: 73 | NIPS
AI Analysis

The paper addresses a security problem for users of efficient vision transformers: adversarial inputs that threaten model availability rather than accuracy.

The paper tackles the vulnerability of token sparsification mechanisms in vision transformers to adversarial attacks, presenting DeSparsify, which successfully exhausts system resources while remaining stealthy, as demonstrated in evaluations on three mechanisms.

Vision transformers have contributed greatly to advancements in the computer vision domain, demonstrating state-of-the-art performance in diverse tasks (e.g., image classification, object detection). However, their high computational requirements grow quadratically with the number of tokens used. Token sparsification mechanisms have been proposed to address this issue. These mechanisms employ an input-dependent strategy, in which uninformative tokens are discarded from the computation pipeline, improving the model's efficiency. However, their dynamism and average-case assumption make them vulnerable to a new threat vector: carefully crafted adversarial examples capable of fooling the sparsification mechanism, resulting in worst-case performance. In this paper, we present DeSparsify, an attack targeting the availability of vision transformers that use token sparsification mechanisms. The attack aims to exhaust the operating system's resources while remaining stealthy. Our evaluation demonstrates the attack's effectiveness on three token sparsification mechanisms and examines the attack's transferability between them and its effect on GPU resources. To mitigate the impact of the attack, we propose various countermeasures.
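
To make the quadratic-cost argument in the abstract concrete, the following is a minimal back-of-the-envelope sketch, not the paper's implementation. It assumes a simplified cost model in which one self-attention layer costs roughly 2·n²·d multiply-adds for n tokens of width d. The function names, the 12-layer depth, the 196-token input, the width of 384, and the 0.7 keep ratio at three pruning stages are all illustrative assumptions, not values taken from the paper. The worst case models an input for which the sparsifier discards nothing.

```python
from typing import Sequence

# Illustrative cost model only; NOT the DeSparsify attack or any real sparsifier.


def attention_cost(num_tokens: int, dim: int) -> int:
    """Approximate multiply-adds for one self-attention layer.

    Counts the QK^T product and the attention-weighted sum over V,
    each about num_tokens^2 * dim operations.
    """
    return 2 * num_tokens * num_tokens * dim


def total_cost(num_tokens: int, dim: int, keep_ratios: Sequence[float]) -> int:
    """Sum attention cost over layers, pruning tokens after each layer.

    keep_ratios[i] is the fraction of tokens kept after layer i
    (1.0 means no tokens are discarded at that layer).
    """
    cost = 0
    tokens = num_tokens
    for ratio in keep_ratios:
        cost += attention_cost(tokens, dim)
        tokens = max(1, int(tokens * ratio))
    return cost


# Hypothetical schedule: 12 layers, 30% of tokens dropped after layers 3, 6, and 9.
AVERAGE_CASE = [1.0, 1.0, 0.7, 1.0, 1.0, 0.7, 1.0, 1.0, 0.7, 1.0, 1.0, 1.0]
# Worst case: the sparsifier is fooled into keeping every token at every layer.
WORST_CASE = [1.0] * 12

if __name__ == "__main__":
    avg = total_cost(196, 384, AVERAGE_CASE)
    worst = total_cost(196, 384, WORST_CASE)
    print(f"average-case attention cost: {avg}")
    print(f"worst-case attention cost:   {worst}")
    print(f"worst / average ratio:       {worst / avg:.2f}")
```

Because the cost is quadratic in the token count, keeping all tokens removes more than the proportional share of the savings that the pruning schedule was designed to deliver. This is the gap an availability attack of this kind aims to exploit.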

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
