Generalizability of Adversarial Robustness Under Distribution Shifts
This work addresses the problem of keeping deep neural networks reliable in dynamic real-world environments, which matters for AI safety and deployment, though its contribution is incremental: it extends robustness evaluation beyond same-distribution settings.
The study investigates how adversarial robustness generalizes under distribution shifts, finding that both empirical and certified robustness transfer to unseen domains, with adversarial augmentation boosting robustness generalization in a medical application without harming clean accuracy.
Recent progress in empirical and certified robustness promises to deliver reliable and deployable Deep Neural Networks (DNNs). Despite that success, most existing evaluations of DNN robustness have been done on images sampled from the same distribution on which the model was trained. However, in the real world, DNNs may be deployed in dynamic environments that exhibit significant distribution shifts. In this work, we take a first step towards thoroughly investigating the interplay between empirical and certified adversarial robustness on the one hand and domain generalization on the other. To do so, we train robust models on multiple source domains and evaluate their accuracy and robustness on an unseen target domain. We observe that: (1) both empirical and certified robustness generalize to unseen domains, and (2) the level of generalizability does not correlate well with input visual similarity, measured by the Fréchet Inception Distance (FID) between source and target domains. We also extend our study to cover a real-world medical application, in which adversarial augmentation significantly boosts the generalization of robustness with minimal effect on clean data accuracy.
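The evaluation protocol described above (hold out one domain, use the rest as sources, then report clean and robust accuracy on the held-out domain) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the domain names, the helper names `leave_one_domain_out` and `clean_and_robust_accuracy`, and the use of a fixed linear classifier in place of a trained DNN. The linear model is chosen only because its worst-case margin under an L-infinity perturbation of radius `eps` has a closed form, namely the clean margin minus `eps` times the L1 norm of the weights. Training on the source domains is omitted.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# One example is (features, label), with label in {-1, +1}.
Example = Tuple[Sequence[float], int]


def leave_one_domain_out(
    domains: Dict[str, List[Example]],
) -> Iterator[Tuple[str, List[Example], List[Example]]]:
    """Yield (target_name, source_examples, target_examples) per held-out domain."""
    for target in domains:
        # Pool every domain except the held-out one as source data.
        source = [
            ex
            for name, data in domains.items()
            if name != target
            for ex in data
        ]
        yield target, source, domains[target]


def clean_and_robust_accuracy(
    w: Sequence[float], b: float, data: List[Example], eps: float
) -> Tuple[float, float]:
    """Clean and certified accuracy of a linear classifier sign(w.x + b).

    For a linear model, an L-infinity perturbation of radius eps can lower
    the margin by at most eps * ||w||_1, so a point is certifiably robust
    exactly when its clean margin exceeds that amount.
    """
    l1_norm = sum(abs(wi) for wi in w)
    clean = 0
    robust = 0
    for x, y in data:
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        clean += margin > 0
        robust += margin - eps * l1_norm > 0
    n = len(data)
    return clean / n, robust / n


# Hypothetical toy domains; the names and values are illustrative only.
domains = {
    "photo": [([1.0, 0.0], 1), ([-1.0, 0.0], -1)],
    "sketch": [([0.2, 0.0], 1), ([-2.0, 0.0], -1)],
    "cartoon": [([1.5, 0.3], 1), ([-0.8, 0.1], -1)],
}

# Stand-in for a model trained on the source domains (training omitted).
w, b = [1.0, 0.0], 0.0

results = {
    target: clean_and_robust_accuracy(w, b, target_data, eps=0.5)
    for target, _source, target_data in leave_one_domain_out(domains)
}
```

In this toy setup the gap between the two numbers for a given target domain is what a study of this kind would track: a model can keep its clean accuracy on the unseen domain while losing certified robustness there, or retain both.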