CV, LG · Mar 22, 2023

Test-time Detection and Repair of Adversarial Samples via Masked Autoencoder

Yun-Yun Tsai, Ju-Chin Chao, Albert Wen, Zhaoyuan Yang, Chengzhi Mao, Tapan Shah, Junfeng Yang

arXiv:2303.12848v36.84 citations · h-index: 14

Originality: Highly original

AI Analysis

This work addresses the need for efficient, generalizable adversarial defenses in machine learning by offering a novel test-time solution that works on frozen models.

The paper tackles the problem of defending against adversarial attacks at test time without adapting model weights. It proposes DRAM, which uses a masked autoencoder to detect and repair adversarial samples, achieving an 82% average detection rate on ImageNet and improving robust accuracy by up to 41% over baselines on a standard ResNet50.

Training-time defenses, known as adversarial training, incur high training costs and do not generalize to unseen attacks. Test-time defenses address these issues, but most existing ones require adapting the model weights, so they do not work on frozen models and complicate model memory management. The only test-time defense that does not adapt model weights instead adapts the input using self-supervision tasks. However, we found empirically that these self-supervision tasks are not sensitive enough to detect adversarial attacks accurately. In this paper, we propose DRAM, a novel defense method that detects and repairs adversarial samples at test time via a masked autoencoder (MAE). We demonstrate how to use MAE losses to build a Kolmogorov-Smirnov test that detects adversarial samples. Moreover, we use the MAE losses to calculate input reversal vectors that repair adversarial samples resulting from previously unseen attacks. Results on the large-scale ImageNet dataset show that, compared with all detection baselines evaluated, DRAM achieves the best detection rate (82% on average) across all eight adversarial attacks evaluated. For attack repair, DRAM improves robust accuracy by 6% to 41% for a standard ResNet50 and by 3% to 8% for a robust ResNet50, compared with baselines that use contrastive learning and rotation prediction.
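The two steps named in the abstract, a Kolmogorov-Smirnov test over MAE losses and an input reversal vector, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`ks_statistic`, `ks_rejects`, `reversal_vector`) are hypothetical, the loss values are synthetic, a toy quadratic loss stands in for the MAE loss, and the signed-gradient update with projection onto an L-infinity ball is an assumption, since the abstract does not specify the update rule.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:  # the largest gap is attained at an observed value
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_rejects(reference, batch, alpha=0.05):
    """True if the batch losses look drawn from a different distribution.

    Uses the standard large-sample critical value for the two-sample test.
    """
    n, m = len(reference), len(batch)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, batch) > critical


def reversal_vector(x, loss_grad, eps=0.1, step=0.02, iters=20):
    """Search for r with |r_i| <= eps that lowers the loss at x + r.

    Assumption: signed-gradient descent with projection onto an L-infinity
    ball; the abstract does not specify the update rule.
    """
    r = [0.0] * len(x)
    for _ in range(iters):
        grad = loss_grad([xi + ri for xi, ri in zip(x, r)])
        for i, g in enumerate(grad):
            if g != 0.0:
                r[i] -= step * math.copysign(1.0, g)
            r[i] = max(-eps, min(eps, r[i]))  # project back into the ball
    return r


# Synthetic stand-ins for per-sample MAE losses (not real measurements).
rng = random.Random(0)
clean_losses = [rng.gauss(0.30, 0.05) for _ in range(200)]
attacked_losses = [rng.gauss(0.45, 0.05) for _ in range(200)]
print("flag attacked batch:", ks_rejects(clean_losses, attacked_losses))

# Toy quadratic loss standing in for the MAE loss around a "clean" input.
clean = [0.2, 0.5, 0.8]
toy_loss = lambda x: sum((xi - ci) ** 2 for xi, ci in zip(x, clean))
toy_grad = lambda x: [2 * (xi - ci) for xi, ci in zip(x, clean)]
attacked = [ci + 0.08 for ci in clean]
r = reversal_vector(attacked, toy_grad)
repaired = [xi + ri for xi, ri in zip(attacked, r)]
print("loss before:", toy_loss(attacked), "after:", toy_loss(repaired))
```

In a real pipeline the reference losses would presumably come from clean held-out images and the gradient from backpropagation through the MAE, while the classifier itself stays frozen. The `alpha` parameter trades false alarms on clean batches against missed attacks.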
