CV CR LG Apr 4, 2024

Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks

Lei Zhang, Yuhang Zhou, Yi Yang, Xinbo Gao

arXiv:2404.03340v1 10.517 citations h-index: 4 IEEE Trans Pattern Anal Mach Intell

Originality: Incremental advance

AI Analysis

This paper addresses a critical security issue in computer vision, adversarial attacks, and offers a defense method intended to work against unknown attacks. The advance is incremental, since it builds on existing distillation and meta-learning techniques.

The paper tackles the vulnerability of deep neural networks to unknown adversarial attacks by proposing Meta Invariance Defense (MID). The authors report generalizable robustness and superiority over other defenses, supported by theoretical and empirical studies on benchmarks such as ImageNet.

Despite providing high-performance solutions for computer vision tasks, deep neural network (DNN) models have been shown to be extremely vulnerable to adversarial attacks. Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked. Moreover, commonly used adaptive learning and fine-tuning techniques are unsuitable for adversarial defense, since defense is essentially a zero-shot problem at deployment. To tackle this challenge, we propose an attack-agnostic defense method named Meta Invariance Defense (MID). Specifically, various combinations of adversarial attacks are randomly sampled from a manually constructed Attacker Pool to constitute different defense tasks against unknown attacks, in which a student encoder is supervised by multi-consistency distillation to learn attack-invariant features via a meta principle. The proposed MID has two merits: 1) Full distillation at the pixel, feature, and prediction levels between benign and adversarial samples facilitates the discovery of attack invariance. 2) The model simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration. Theoretical and empirical studies on numerous benchmarks such as ImageNet verify the generalizable robustness and superiority of MID under various attacks.
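
The task construction and three-level consistency described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The contents of `ATTACKER_POOL`, the function names, the disjoint train/test split of attacks within a task, the equal loss weights, and the use of mean squared error at every level are all assumptions made for illustration; the abstract states only that attack combinations are sampled from an Attacker Pool and that distillation happens at the pixel, feature, and prediction levels.

```python
import random

# Hypothetical attacker pool; the abstract does not list the pool's contents.
ATTACKER_POOL = ["FGSM", "PGD", "CW", "BIM", "DeepFool"]


def sample_defense_task(pool, n_train, n_test, rng):
    """Draw two disjoint attack subsets for one defense task.

    Assumption: attacks in the second subset are held out from the first,
    so they play the role of "unknown" attacks within the task.
    """
    chosen = rng.sample(pool, n_train + n_test)
    return chosen[:n_train], chosen[n_train:]


def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_consistency_loss(benign, adversarial, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of pixel-, feature- and prediction-level discrepancies.

    `benign` and `adversarial` are dicts with keys "pixel", "feature" and
    "prediction", each mapping to a flat list of floats. MSE and the equal
    default weights are illustrative choices, not taken from the paper.
    """
    levels = ("pixel", "feature", "prediction")
    return sum(w * mse(benign[k], adversarial[k]) for w, k in zip(weights, levels))


if __name__ == "__main__":
    rng = random.Random(0)
    seen, held_out = sample_defense_task(ATTACKER_POOL, n_train=3, n_test=2, rng=rng)
    print("seen attacks:", seen)
    print("held-out attacks:", held_out)
```

In this sketch the loss is zero when the adversarial outputs match the benign ones at all three levels, which is the sense in which the student is pushed toward attack-invariant features. A real training loop would compute the three outputs with a teacher and a student network and backpropagate through the loss.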

