CV · Feb 1, 2023

CertViT: Certified Robustness of Pre-Trained Vision Transformers

arXiv:2302.10287v18.48 citations · h-index: 6 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses scalable certified adversarial robustness for large pre-trained models such as Vision Transformers, which matters for safety-critical applications. The advance is incremental because it builds on existing Lipschitz bounding methods.

The paper tackles the problem of achieving certified robustness for large Vision Transformers, which was previously infeasible due to computational limitations, by introducing CertViT, a two-step proximal-projection method that yields better certified accuracy than state-of-the-art Lipschitz trained networks.

Lipschitz bounded neural networks are certifiably robust and have a good trade-off between clean and certified accuracy. Existing Lipschitz bounding methods train from scratch and are limited to moderately sized networks (< 6M parameters). They require a fair amount of hyper-parameter tuning and are computationally prohibitive for large networks like Vision Transformers (5M to 660M parameters). Obtaining certified robustness of transformers is not feasible due to the non-scalability and inflexibility of the current methods. This work presents CertViT, a two-step proximal-projection method to achieve certified robustness from pre-trained weights. The proximal step tries to lower the Lipschitz bound and the projection step tries to maintain the clean accuracy of pre-trained weights. We show that CertViT networks have better certified accuracy than state-of-the-art Lipschitz trained networks. We apply CertViT to several variants of pre-trained vision transformers and show adversarial robustness using standard attacks. Code: https://github.com/sagarverma/transformer-lipschitz
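To make the link between a Lipschitz bound and a certificate concrete, here is a minimal sketch. It is not the CertViT algorithm and it does not model a transformer. It assumes a plain feed-forward stack of linear layers with 1-Lipschitz activations, so the product of the layers' spectral norms is an upper bound on the network's L2 Lipschitz constant. The function names (`spectral_norm`, `lipschitz_upper_bound`, `certified_radius`) are hypothetical, and the radius uses the standard margin-based bound for L2 Lipschitz classifiers: margin / (sqrt(2) * L).

```python
import math

def spectral_norm(W, iters=100):
    """Estimate the largest singular value of W (a list of rows) by power iteration.

    The estimate approaches the true value from below, so a real certificate
    would need a converged value or a guaranteed upper bound instead.
    """
    n_rows, n_cols = len(W), len(W[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # unit-norm starting vector
    sigma = 0.0
    for _ in range(iters):
        u = [sum(w * x for w, x in zip(row, v)) for row in W]                # u = W v
        sigma = math.sqrt(sum(x * x for x in u))                             # ||W v|| with ||v|| = 1
        z = [sum(W[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # z = W^T u
        norm_z = math.sqrt(sum(x * x for x in z))
        if norm_z == 0.0:
            return 0.0
        v = [x / norm_z for x in z]
    return sigma

def lipschitz_upper_bound(weights):
    """Product of per-layer spectral norms (assumes 1-Lipschitz activations between layers)."""
    bound = 1.0
    for W in weights:
        bound *= spectral_norm(W)
    return bound

def certified_radius(logits, lipschitz_bound):
    """L2 radius within which the predicted class cannot change: margin / (sqrt(2) * L)."""
    top, runner_up = sorted(logits, reverse=True)[:2]
    return (top - runner_up) / (math.sqrt(2.0) * lipschitz_bound)

# Toy two-layer example with hand-picked diagonal weights (illustrative values only).
W1 = [[2.0, 0.0], [0.0, 1.0]]
W2 = [[0.5, 0.0], [0.0, 0.25]]
L = lipschitz_upper_bound([W1, W2])
radius = certified_radius([3.0, 1.0], L)
```

In this toy setting, lowering the Lipschitz bound enlarges the certified radius for a fixed logit margin, which is the quantity the abstract's proximal step is described as targeting; keeping the margin (and hence clean accuracy) from collapsing is the role the abstract assigns to the projection step.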
