A Random Ensemble of Encrypted Vision Transformers for Adversarially Robust Defense
This work addresses the problem of adversarial robustness in image classification for security-critical applications, representing an incremental improvement over prior encrypted model defenses.
The paper tackles the vulnerability of deep neural networks to adversarial examples by proposing a random ensemble of encrypted vision transformers, which enhances robustness against both white-box and black-box attacks. The method was tested on CIFAR-10 and ImageNet datasets, outperforming conventional defenses in clean and robust accuracy on the RobustBench benchmark.
Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In previous studies, models encrypted with a secret key were demonstrated to be robust against white-box attacks, but not against black-box ones. In this paper, we propose a novel method that uses a random ensemble of encrypted vision transformer (ViT) models to enhance robustness against both white-box and black-box attacks. In addition, a benchmark attack method called AutoAttack is applied to the models to test adversarial robustness objectively. In experiments, the method was demonstrated to be robust against not only white-box attacks but also black-box ones in an image classification task on the CIFAR-10 and ImageNet datasets. The method was also compared with state-of-the-art defenses on RobustBench, a standardized benchmark for adversarial robustness, and it was verified to outperform conventional defenses in terms of both clean accuracy and robust accuracy.
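The idea of a random ensemble of key-encrypted models can be illustrated with a minimal sketch. The abstract does not specify the encryption scheme or how the ensemble combines its members, so everything below is an assumption made for illustration: a key-seeded shuffle inside each image block stands in for the encryption, each query is answered by a randomly chosen subset of models whose class scores are averaged, and the names `key_to_permutation`, `encrypt_blocks`, and `RandomEnsemble` are hypothetical. Each model is a plain callable standing in for a ViT trained under its own key.

```python
import random
from typing import Callable, List, Optional, Sequence

Block = List[float]                           # one flattened image block
Model = Callable[[List[Block]], List[float]]  # encrypted blocks -> class scores


def key_to_permutation(key: int, block_len: int) -> List[int]:
    """Derive a fixed permutation of in-block positions from a secret key."""
    perm = list(range(block_len))
    random.Random(key).shuffle(perm)  # seeded: the same key gives the same permutation
    return perm


def encrypt_blocks(blocks: Sequence[Block], perm: Sequence[int]) -> List[Block]:
    """Apply the same key-derived shuffle to every block (assumed scheme)."""
    return [[block[i] for i in perm] for block in blocks]


class RandomEnsemble:
    """Pool of (key, model) pairs; each query uses a random subset of the pool."""

    def __init__(
        self,
        keys: Sequence[int],
        models: Sequence[Model],
        block_len: int,
        subset_size: int,
        rng: Optional[random.Random] = None,
    ) -> None:
        if len(keys) != len(models):
            raise ValueError("need exactly one key per model")
        if not 1 <= subset_size <= len(models):
            raise ValueError("subset_size must be between 1 and the pool size")
        self.perms = [key_to_permutation(k, block_len) for k in keys]
        self.models = list(models)
        self.subset_size = subset_size
        # An unpredictable source by default, so the chosen subset cannot be guessed.
        self.rng = rng if rng is not None else random.SystemRandom()

    def predict(self, blocks: Sequence[Block]) -> List[float]:
        """Average the class scores of a randomly selected subset of models."""
        chosen = self.rng.sample(range(len(self.models)), self.subset_size)
        totals: Optional[List[float]] = None
        for idx in chosen:
            # Each model sees the input transformed with its own key.
            scores = self.models[idx](encrypt_blocks(blocks, self.perms[idx]))
            if totals is None:
                totals = list(scores)
            else:
                totals = [t + s for t, s in zip(totals, scores)]
        return [t / self.subset_size for t in totals]
```

In this sketch the per-query randomness means that repeated queries may be answered by different model subsets. Whether the actual method selects one model or several, and how it aggregates their outputs, is not stated in the abstract.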