Aliasing is a Driver of Adversarial Attacks
This work addresses the problem of adversarial vulnerability in deep learning models for security-critical applications, offering an explainable, non-trained approach, though it is incremental in that it builds on existing robust training methods.
The paper tackles adversarial attacks on neural networks by investigating aliasing as a cause. It finds that reducing aliasing through structural modifications increases robustness, and that combining anti-aliasing with robust training outperforms robust training alone on L2 attacks without significant losses on L∞ attacks.
Aliasing is a highly important concept in signal processing, as careful consideration of resolution changes is essential in ensuring transmission and processing quality of audio, image, and video. Despite this, until recently aliasing has received very little consideration in Deep Learning, with all common architectures carelessly sub-sampling without considering aliasing effects. In this work, we investigate the hypothesis that the existence of adversarial perturbations is due in part to aliasing in neural networks. Our ultimate goal is to increase robustness against adversarial attacks using explainable, non-trained, structural changes only, derived from aliasing first principles. Our contributions are the following. First, we establish a sufficient condition for no aliasing for general image transformations. Next, we study sources of aliasing in common neural network layers, and derive simple modifications from first principles to eliminate or reduce it. Lastly, our experimental results show a solid link between anti-aliasing and adversarial attacks. Simply reducing aliasing already results in more robust classifiers, and combining anti-aliasing with robust training outperforms solo robust training on $L_2$ attacks with no or minimal losses in performance on $L_{\infty}$ attacks.
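The aliasing that unfiltered sub-sampling introduces can be illustrated on a one-dimensional signal. The following is a minimal sketch and not the paper's method: the function names `subsample` and `ideal_lowpass`, the signal length, and the chosen frequencies are illustrative assumptions, and an ideal frequency-domain low-pass filter stands in for whatever anti-aliasing modification a network layer might use. It relies only on the Python standard library.

```python
import cmath
import math


def subsample(x, stride=2):
    """Naive strided sub-sampling: keep every `stride`-th sample."""
    return x[::stride]


def ideal_lowpass(x, stride=2):
    """Remove every frequency at or above the Nyquist limit of the sub-sampled signal."""
    n = len(x)
    # Forward DFT (O(n^2), acceptable for a short illustrative signal).
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    cutoff = n / (2 * stride)  # new Nyquist limit, in cycles per n samples
    for k in range(n):
        # Bin k holds frequency k for k <= n/2 and frequency k - n otherwise.
        if min(k, n - k) >= cutoff:
            spectrum[k] = 0.0
    # Inverse DFT; the input is real, so only the real part is kept.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


# Hypothetical test signal: a low-frequency component plus a weaker
# high-frequency component that exceeds the post-sub-sampling Nyquist limit.
n = 64
low = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]    # 4 cycles, below the limit of 16
high = [math.cos(2 * math.pi * 28 * t / n) for t in range(n)]  # 28 cycles, above the limit of 16
signal = [l + 0.5 * h for l, h in zip(low, high)]

reference = subsample(low)                       # what an alias-free result should look like
aliased = subsample(signal)                      # sub-sampling with no filtering
antialiased = subsample(ideal_lowpass(signal))   # filter first, then sub-sample

alias_error = max(abs(a - r) for a, r in zip(aliased, reference))
clean_error = max(abs(a - r) for a, r in zip(antialiased, reference))
print(f"max error without filtering: {alias_error:.3f}")
print(f"max error with low-pass filtering: {clean_error:.2e}")
```

In this setup the 28-cycle component folds onto the 4-cycle component after stride-2 sub-sampling, so the unfiltered output deviates from the reference by 0.5 (the full amplitude of the high-frequency term), while the filtered output matches the reference up to floating-point round-off. A small high-frequency input change thus masquerades as a change in low-frequency content, which is the kind of mechanism the hypothesis above connects to adversarial perturbations.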