CV · Sep 24, 2024

Adversarial Backdoor Defense in CLIP

Junhao Kuang, Siyuan Liang, Jiawei Liang, Kuanrong Liu, Xiaochun Cao

arXiv:2409.15968v1 · 14.112 citations · h-index: 24

Originality: Incremental advance

AI Analysis

The work addresses a security problem for users of CLIP and similar models by providing a robust defense against backdoor attacks, though the advance is incremental because it builds on existing defense methods.

The paper tackles the vulnerability of multimodal contrastive pretraining models like CLIP to backdoor attacks by proposing Adversarial Backdoor Defense (ABD), a data augmentation strategy that, compared with the CleanCLIP defense, lowers attack success rates by up to 53.64% (against BadCLIP) at the cost of an average 1.73% decrease in clean accuracy.

Multimodal contrastive pretraining, exemplified by models like CLIP, has been found to be vulnerable to backdoor attacks. While current backdoor defense methods primarily employ conventional data augmentation to create augmented samples aimed at feature alignment, these methods fail to capture the distinct features of backdoor samples, resulting in suboptimal defense performance. Observations reveal that adversarial examples and backdoor samples exhibit similarities in the feature space within the compromised models. Building on this insight, we propose Adversarial Backdoor Defense (ABD), a novel data augmentation strategy that aligns features with meticulously crafted adversarial examples. This approach effectively disrupts the backdoor association. Our experiments demonstrate that ABD provides robust defense against both traditional uni-modal and multimodal backdoor attacks targeting CLIP. Compared to the current state-of-the-art defense method, CleanCLIP, ABD reduces the attack success rate by 8.66% for BadNet, 10.52% for Blended, and 53.64% for BadCLIP, while maintaining a minimal average decrease of just 1.73% in clean accuracy.
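The abstract does not say how the adversarial examples are crafted or which alignment loss is used, so the following is only a toy illustration of the general idea (perturb an input to shift its features, then align clean and perturbed features with a contrastive loss), not the authors' implementation. The linear encoder `W`, the PGD-style attack, the InfoNCE-style loss, and every parameter value (`eps`, `alpha`, `steps`, `tau`) are assumptions introduced for this sketch.

```python
import numpy as np

def encode(W, X):
    """Toy linear 'image encoder' (an assumption): project rows of X, then L2-normalize."""
    Z = X @ W.T
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)

def pgd_feature_attack(W, X, eps=0.1, alpha=0.02, steps=10, seed=0):
    """PGD-style attack: find delta with |delta| <= eps (L-inf) that pushes the
    unnormalized features W(x + delta) away from W x, i.e. maximizes ||W delta||^2."""
    rng = np.random.default_rng(seed)
    # Random start, because the gradient is exactly zero at delta = 0.
    delta = rng.uniform(-eps, eps, size=X.shape)
    for _ in range(steps):
        grad = 2.0 * (delta @ W.T) @ W          # gradient of ||W delta||^2 w.r.t. delta
        delta = np.clip(delta + alpha * np.sign(grad), -eps, eps)
    return X + delta

def alignment_loss(Z_a, Z_b, tau=0.07):
    """InfoNCE-style loss: row i of Z_a should match row i of Z_b."""
    logits = Z_a @ Z_b.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Hypothetical data: 4 "images" with 16 features, encoded into 8 dimensions.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(4, 16))

X_adv = pgd_feature_attack(W, X)
loss = alignment_loss(encode(W, X), encode(W, X_adv))
print("alignment loss between clean and adversarial features:", loss)
```

In a real fine-tuning setup this loss term would be minimized with respect to the encoder parameters alongside the usual image-text contrastive objective; here `W` is fixed and the loss is only evaluated.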
