CR AI LG · Nov 8, 2018

New CleverHans Feature: Better Adversarial Robustness Evaluations with Attack Bundling

arXiv:1811.03685v1 · 9.65 citations

Originality: Synthesis-oriented

AI Analysis

This is an incremental improvement for researchers evaluating adversarial robustness in machine learning models.

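The paper addresses a problem in robustness evaluation: reported error rates can underestimate a model's true worst-case error under attack, which makes the model look more robust than it is. It proposes 'attack bundling' in the CleverHans library, which takes the maximum across attacks for each example and only then averages over examples to get the error rate. The paper argues that the traditional approach, reporting the maximum error rate across attacks, can underestimate the true worst-case error rate by an amount approaching 100% as the number of attacks grows.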

This technical report describes a new feature of the CleverHans library called "attack bundling". Many papers about adversarial examples present lists of error rates corresponding to different attack algorithms. A common approach is to take the maximum across this list and compare defenses against that error rate. We argue that a better approach is to use attack bundling: the max should be taken across many attacks at the level of individual examples, then the error rate should be calculated by averaging after this maximization operation. Reporting the bundled attacker error rate provides a lower bound on the true worst-case error rate. The traditional approach of reporting the maximum error rate across attacks can underestimate the true worst-case error rate by an amount approaching 100% as the number of attacks approaches infinity. Attack bundling can be used with different prioritization schemes to optimize quantities such as error rate on adversarial examples, perturbation size needed to cause misclassification, or failure rate when using a specific confidence threshold.
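
The difference between the two reporting conventions can be illustrated with a small sketch. This is not the CleverHans implementation: the function names `max_over_attacks_error` and `bundled_error` and the `success` table are hypothetical, made up for illustration. It assumes each attack's outcome on each example is already known as a boolean (True if the attack caused a misclassification).

```python
def max_over_attacks_error(success):
    """Traditional report: compute each attack's error rate, then take the max."""
    n_examples = len(success[0])
    return max(sum(row) / n_examples for row in success)


def bundled_error(success):
    """Bundled report: an example counts as an error if any attack fools it,
    i.e. the max over attacks is taken per example before averaging."""
    n_examples = len(success[0])
    fooled = [any(column) for column in zip(*success)]
    return sum(fooled) / n_examples


# Hypothetical outcomes (assumed data): rows are attacks, columns are examples.
# True means the attack caused a misclassification on that example.
success = [
    [True, False, False, False],
    [False, True, False, False],
    [False, False, True, False],
]

traditional = max_over_attacks_error(success)
bundled = bundled_error(success)
print(traditional, bundled)
```

In this made-up table each attack fools a different example, so every individual attack reports a 25% error rate while the bundled attacker reaches 75%. Adding more attacks that each succeed on a different small slice of the data widens the gap further, which is the intuition behind the "approaching 100%" statement in the abstract.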
