Evaluating the Adversarial Robustness of Adaptive Test-time Defenses
This work evaluates the adversarial robustness of adaptive test-time defenses and shows that current adaptive defenses do not yet improve meaningfully on static ones.
The paper evaluates adaptive test-time defenses for adversarial robustness in image classification and finds that none significantly improves upon static defenses; some even weaken the underlying static model while increasing inference computation.
Adaptive defenses, which optimize at test time, promise to improve adversarial robustness. We categorize such adaptive test-time defenses, explain their potential benefits and drawbacks, and evaluate a representative variety of the latest adaptive defenses for image classification. Unfortunately, none significantly improve upon static defenses when subjected to our careful case study evaluation. Some even weaken the underlying static model while simultaneously increasing inference computation. While these results are disappointing, we still believe that adaptive test-time defenses are a promising avenue of research and, as such, we provide recommendations for their thorough evaluation. We extend the checklist of Carlini et al. (2019) by providing concrete steps specific to adaptive defenses.
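To make the distinction between static and adaptive defenses concrete, the following is a minimal toy sketch and not a method taken from the paper. It assumes a linear classifier and a hypothetical purification objective (squared distance to the nearest class prototype); the names `static_predict`, `adaptive_predict`, and `prototypes` are illustrative only. The static model uses a single forward pass, whereas the adaptive wrapper first runs an optimization loop on the input at test time, which is where the extra inference computation comes from.

```python
import numpy as np


def static_predict(w, b, x):
    """Static defense: a single fixed forward pass of a linear classifier."""
    return int(w @ x + b > 0)


def adaptive_predict(w, b, x, prototypes, steps=10, lr=0.1):
    """Toy adaptive defense: optimize the input at test time, then classify.

    The auxiliary objective (squared distance to the nearest prototype) is a
    hypothetical choice made for illustration only.
    Returns the predicted label and the number of extra optimization steps.
    """
    z = np.asarray(x, dtype=float).copy()
    for _ in range(steps):
        # Pick the prototype closest to the current purified input.
        distances = np.linalg.norm(prototypes - z, axis=1)
        nearest = prototypes[np.argmin(distances)]
        # Gradient step on ||z - nearest||^2 with respect to z.
        z -= lr * 2.0 * (z - nearest)
    return static_predict(w, b, z), steps


if __name__ == "__main__":
    w, b = np.array([1.0, 0.0]), 0.0
    prototypes = np.array([[-1.0, 0.0], [1.0, 0.0]])
    x_adv = np.array([0.1, 0.0])  # hypothetical perturbed input
    print(static_predict(w, b, x_adv))
    print(adaptive_predict(w, b, x_adv, prototypes))
```

In this toy setting the purification step pulls the perturbed input toward the prototype it is already closest to, so the adaptive wrapper returns the same label as the static model while spending ten additional optimization steps. This says nothing about any real defense; it only illustrates why an evaluation should compare an adaptive defense against its underlying static model and account for the added inference cost.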