Image and Model Transformation with Secret Key for Vision Transformer
The work addresses model protection for vision transformer users, but the contribution is incremental because it builds on existing image encryption and ViT techniques.
The paper tackles the protection of vision transformer models by transforming both the images and the model with a secret key. A model trained on plain images can be transformed directly, without retraining, so that it works on encrypted images, and on test images encrypted with the same key it achieves the same accuracy as the plain-image model.
In this paper, we propose a combined use of transformed images and vision transformer (ViT) models transformed with a secret key. We show for the first time that models trained with plain images can be directly transformed to models trained with encrypted images on the basis of the ViT architecture, and that the performance of the transformed models is the same as that of models trained with plain images when using test images encrypted with the key. In addition, the proposed scheme does not require any specially prepared training data or network modification, so it also allows us to easily update the secret key. In an experiment, the effectiveness of the proposed scheme is evaluated in terms of performance degradation and model protection performance in an image classification task on the CIFAR-10 dataset.
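The abstract does not specify the encryption used, so the following is only a minimal sketch of why such a transformation can be lossless, under the assumption that the image is encrypted by shuffling pixels within each patch with a key-derived permutation. Because the ViT patch embedding is a linear map, applying the same permutation to the rows of the embedding matrix cancels the shuffle. All names here (make_permutation, encrypt_patch, transform_embedding, embed) are hypothetical, and the bias term and position embedding are omitted.

```python
import random

def make_permutation(key, n):
    """Derive a permutation of range(n) from an integer secret key (assumed scheme)."""
    rng = random.Random(key)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

def encrypt_patch(patch, perm):
    """Shuffle the pixels of one flattened patch: out[i] = patch[perm[i]]."""
    return [patch[j] for j in perm]

def transform_embedding(E, perm):
    """Apply the same shuffle to the rows of the patch-embedding matrix E."""
    return [E[j] for j in perm]

def embed(patch, E):
    """Linear patch embedding z = patch @ E (bias omitted for brevity)."""
    dim = len(E[0])
    return [sum(patch[i] * E[i][d] for i in range(len(patch))) for d in range(dim)]

# Toy sizes: a flattened 4x4 RGB patch (48 values) and an 8-dimensional embedding.
rng = random.Random(0)
patch_len, dim = 48, 8
patch = [rng.random() for _ in range(patch_len)]
E = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(patch_len)]

perm = make_permutation(key=1234, n=patch_len)
E_key = transform_embedding(E, perm)

z_plain = embed(patch, E)                           # plain image, plain model
z_enc = embed(encrypt_patch(patch, perm), E_key)    # encrypted image, transformed model
z_mismatch = embed(patch, E_key)                    # plain image, transformed model

# The two outputs agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(z_plain, z_enc))
# Without the matching encryption the embedding is generally different.
mismatch_diff = max(abs(a - b) for a, b in zip(z_plain, z_mismatch))
```

In this sketch, updating the key only requires re-permuting the rows of E with a new permutation, which is consistent with the abstract's point that no retraining or network modification is needed. The transformed model paired with unencrypted input yields a different embedding, which illustrates the protection aspect in this toy setting only.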