Improving Adversarial Robustness via Unlabeled Out-of-Domain Data
This work addresses the challenge of making machine learning models more robust to adversarial attacks, particularly when labeled data is limited, by proposing a novel data augmentation strategy based on unlabeled out-of-domain data.
The paper improves the adversarial robustness of machine learning models by leveraging unlabeled out-of-domain data. It shows that this approach can help bridge the sample complexity gap between standard and robust classification and, in some settings, achieve better robustness than unlabeled data from the same domain. Experiments on CIFAR-10 and SVHN demonstrate substantial improvements against $\ell_\infty$ adversarial attacks.
Data augmentation by incorporating cheap unlabeled data from multiple domains is a powerful way to improve prediction, especially when labeled data is limited. In this work, we investigate how adversarial robustness can be enhanced by leveraging out-of-domain unlabeled data. We demonstrate that for broad classes of distributions and classifiers, there exists a sample complexity gap between standard and robust classification. We quantify the degree to which this gap can be bridged by leveraging unlabeled samples from a shifted domain, providing both upper and lower bounds. Moreover, we show settings where we achieve better adversarial robustness when the unlabeled data come from a shifted domain rather than from the same domain as the labeled data. We also investigate how to leverage out-of-domain data when some structural information, such as sparsity, is shared between the labeled and unlabeled domains. Experimentally, we augment two object recognition datasets (CIFAR-10 and SVHN) with easy-to-obtain, unlabeled out-of-domain data and demonstrate substantial improvement in the model's robustness against $\ell_\infty$ adversarial attacks on the original domain.
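To make the notion of an $\ell_\infty$ adversarial attack concrete, the following is a minimal sketch for a linear classifier $\mathrm{sign}(w \cdot x + b)$ with labels in $\{-1, +1\}$. The linear model, the function names, and the numbers are assumptions chosen for illustration; they are not the models or the attack used in the paper's experiments. For a linear classifier, the worst-case perturbation with $\|\delta\|_\infty \le \epsilon$ is $\delta = -\epsilon\, y\, \mathrm{sign}(w)$, which lowers the margin by exactly $\epsilon \|w\|_1$.

```python
# Illustrative only: worst-case l_inf perturbation for a linear classifier.
# All names and values here are hypothetical, not taken from the paper.

def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def margin(x, y, w, b):
    """Signed margin y * (w . x + b); positive means correctly classified."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def linf_worst_case(x, y, w, eps):
    """Shift every coordinate by eps in the direction that reduces the margin."""
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]

w = [1.0, -2.0, 0.5]   # hypothetical weights
b = 0.1                # hypothetical bias
x = [0.3, -0.2, 0.4]   # hypothetical input
y = 1                  # true label in {-1, +1}
eps = 0.1              # l_inf attack radius

x_adv = linf_worst_case(x, y, w, eps)
clean_margin = margin(x, y, w, b)
adv_margin = margin(x_adv, y, w, b)
l1_norm = sum(abs(wi) for wi in w)

print("clean margin:      ", clean_margin)
print("adversarial margin:", adv_margin)
print("eps * ||w||_1:     ", eps * l1_norm)
```

A classifier is robust at this input only if the adversarial margin stays positive, which is a stricter requirement than a positive clean margin. This stricter requirement is one intuition for why robust classification can need more samples than standard classification.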