Robust Feature Inference: A Test-time Defense Strategy using Spectral Projections
This work addresses the need for efficient test-time defenses in machine learning security, offering a method that integrates easily with existing training procedures without increasing inference time. The contribution appears incremental, however, because it builds on existing robust training procedures.
The paper tackles the problem of improving the robustness of deep neural networks to adversarial examples at test time. It proposes Robust Feature Inference (RFI), a strategy that projects trained models onto a robust feature space without extra test-time computation. The reported results show that it consistently improves robustness across multiple datasets and attacks.
Test-time defenses are used to improve the robustness of deep neural networks to adversarial examples during inference. However, existing methods either require an additional trained classifier to detect and correct the adversarial samples, or perform additional complex optimization on the model parameters or the input to adapt to the adversarial samples at test time, resulting in a significant increase in inference time compared to the base model. In this work, we propose a novel test-time defense strategy called Robust Feature Inference (RFI) that is easy to integrate with any existing (robust) training procedure without additional test-time computation. Based on the notion of robustness of features that we present, the key idea is to project the trained models onto the most robust feature space, thereby reducing the vulnerability to adversarial attacks in non-robust directions. We theoretically characterize the subspace of the eigenspectrum of the feature covariance that is the most robust for a generalized additive model. Our extensive experiments on the CIFAR-10, CIFAR-100, Tiny ImageNet and ImageNet datasets for several robustness benchmarks, including the state-of-the-art methods in RobustBench, show that RFI consistently improves robustness across adaptive and transfer attacks. We also compare RFI with adaptive test-time defenses to demonstrate the effectiveness of our proposed approach.
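To make the projection idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes penultimate-layer features are available as a NumPy array and that the final layer is linear, and it ranks eigenvectors of the feature covariance by eigenvalue as a stand-in for the robustness criterion defined in the paper. The function names `robust_projector` and `fold_into_head` and the parameter `k` are hypothetical. Because the projector is folded into the final-layer weights once, inference costs the same as for the base model.

```python
import numpy as np


def robust_projector(features, k):
    """Build a (d, d) projector onto k eigenvectors of the feature covariance.

    ASSUMPTION: eigenvectors are ranked by eigenvalue; the paper selects them
    by its own robustness criterion, which is not reproduced here.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / features.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    basis = eigvecs[:, order]                   # (d, k), orthonormal columns
    return basis @ basis.T


def fold_into_head(weights, projector):
    """Fold the projector into linear-head weights of shape (d, num_classes).

    (features @ projector) @ weights == features @ (projector @ weights),
    so the projected model keeps the same weight shape and inference cost.
    """
    return projector @ weights


# Hypothetical usage with random stand-ins for real features and weights.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 8))               # 200 samples, 8-dim features
head = rng.normal(size=(8, 3))                  # linear head for 3 classes
proj = robust_projector(feats, k=4)
projected_head = fold_into_head(head, proj)
logits = feats @ projected_head                 # same cost as feats @ head
```

In this sketch the only change to the deployed model is replacing `head` with `projected_head`; how many directions to keep, and which ones, is the part the paper's theory addresses.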