CV · LG · Mar 22, 2023

Test-time Detection and Repair of Adversarial Samples via Masked Autoencoder

arXiv:2303.12848v3 · 4 citations · h-index: 14
Originality: Highly original
AI Analysis

This work addresses the need for efficient and generalizable adversarial defenses in machine learning, offering a novel test-time solution that works on frozen models.

The paper tackles the problem of defending against adversarial attacks at test time without adapting model weights. It proposes DRAM, which uses a masked autoencoder to detect and repair adversarial samples, achieving an 82% average detection rate on ImageNet and improving robust accuracy by up to 41%.

Training-time defenses, known as adversarial training, incur high training costs and do not generalize to unseen attacks. Test-time defenses solve these issues, but most existing test-time defenses require adapting the model weights; they therefore do not work on frozen models and complicate model memory management. The only test-time defense that does not adapt model weights aims to adapt the input with self-supervision tasks. However, we empirically found that these self-supervision tasks are not sensitive enough to detect adversarial attacks accurately. In this paper, we propose DRAM, a novel defense method to detect and repair adversarial samples at test time via a masked autoencoder (MAE). We demonstrate how to use MAE losses to build a Kolmogorov-Smirnov test to detect adversarial samples. Moreover, we use the MAE losses to calculate input reversal vectors that repair adversarial samples resulting from previously unseen attacks. Results on the large-scale ImageNet dataset show that, compared to all detection baselines evaluated, DRAM achieves the best detection rate (82% on average) on all eight adversarial attacks evaluated. For attack repair, DRAM improves the robust accuracy by 6% to 41% for standard ResNet50 and 3% to 8% for robust ResNet50 compared with the baselines that use contrastive learning and rotation prediction.
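
To make the detection idea concrete, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov test applied to loss values. It is not the paper's implementation. The loss values are synthetic stand-ins for MAE reconstruction losses, the function names `ks_statistic` and `is_shifted` are hypothetical, and the sketch assumes that attacked inputs shift the loss distribution away from that of clean inputs. The rejection threshold is the standard large-sample critical value for the two-sample test.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x (this handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def is_shifted(reference_losses, test_losses, alpha=0.05):
    """Reject 'same distribution' when D exceeds the asymptotic critical value."""
    n, m = len(reference_losses), len(test_losses)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference_losses, test_losses) > critical


# Synthetic stand-ins for MAE losses (assumption: these are NOT real measurements).
rng = random.Random(0)
reference = [rng.gauss(0.30, 0.05) for _ in range(200)]      # clean calibration set
attacked_like = [rng.gauss(0.45, 0.05) for _ in range(200)]  # shifted losses

print("flagged as shifted:", is_shifted(reference, attacked_like))
```

In a real pipeline the reference list would hold losses computed on known-clean data and the test list would hold losses for the inputs under inspection; the significance level `alpha` trades false alarms against missed detections.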

