LG, AI, CV · Mar 13, 2025

Robustness Tokens: Towards Adversarial Robustness of Transformers

arXiv:2503.10191v1 · 2 citations · h-index: 15 · ECCV
Originality: Incremental advance
AI Analysis

This work addresses security concerns for practitioners who use pre-trained transformers in downstream tasks, though it is incremental because it builds on existing adversarial training methods.

The paper tackles the vulnerability of publicly available pre-trained transformer models to adversarial attacks by proposing Robustness Tokens, a method that fine-tunes a few private tokens instead of model parameters, resulting in significantly improved robustness to white-box attacks while maintaining original downstream performance.

Recently, large pre-trained foundation models have become widely adopted by machine learning practitioners for a multitude of tasks. Because such models are publicly available, using them as backbones for downstream tasks can leave those tasks highly vulnerable to adversarial attacks crafted with the same public model. In this work, we propose Robustness Tokens, a novel approach specific to the transformer architecture that fine-tunes a few additional private tokens with low computational requirements instead of tuning model parameters as done in traditional adversarial training. We show that Robustness Tokens make Vision Transformer models significantly more robust to white-box adversarial attacks while also retaining the original downstream performance.
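
The core idea in the abstract, extra private tokens attached to a frozen transformer, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the toy `attention` function (single head, identity projections) stands in for a frozen pre-trained block, the helper `forward_with_robustness_tokens` is a hypothetical name, and the token values are random rather than trained. The training objective used to fit the tokens is not described in this summary, so it is left out.

```python
import math
import random

def attention(tokens):
    """Toy frozen self-attention (single head, identity projections).

    Stand-in for a pre-trained transformer block; it has no trainable weights.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Scaled dot-product scores of this query against every token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]  # numerically stable softmax
        z = sum(weights)
        out.append([sum(w / z * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def forward_with_robustness_tokens(patch_tokens, robustness_tokens):
    """Hypothetical helper: append private tokens, run the frozen block,
    and keep only the outputs at the original patch positions."""
    sequence = patch_tokens + robustness_tokens
    return attention(sequence)[:len(patch_tokens)]

random.seed(0)
dim, num_patches, num_private = 4, 6, 2
patches = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_patches)]
# Assumption: in practice these values would be learned; here they are random.
private = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_private)]

plain = attention(patches)
augmented = forward_with_robustness_tokens(patches, private)

# The only quantities that would be tuned are the private token values.
trainable = num_private * dim
```

In this sketch the backbone is untouched and the output keeps the same shape as before, so a downstream head could consume it unchanged; the private tokens influence the patch representations only through attention. An attacker with the public model but without the private tokens would be crafting perturbations against `plain`, not `augmented`.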

Code Implementations: 1 repo