Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization
The work addresses a critical security issue for AI systems in applications such as vision and speech, though the contribution appears incremental because it builds on existing regularization techniques.
The paper tackles the vulnerability of deep neural networks to adversarial attacks by proposing Jacobian regularization, which improves robustness with minimal loss of accuracy.
Deep neural networks have recently shown tremendous performance in various applications, including vision and speech processing tasks. However, despite their high accuracy on these tasks, they have been shown to be highly susceptible to adversarial attacks: a small change in the input can cause the network to err with high confidence. This phenomenon exposes an inherent fault in these networks and in their ability to generalize well. For this reason, providing robustness to adversarial attacks is an important challenge in network training, which has led to extensive research. In this work, we suggest a theoretically inspired novel approach to improve network robustness. Our method regularizes the network using the Frobenius norm of its Jacobian, and is applied as post-processing, after regular training has finished. We demonstrate empirically that it leads to enhanced robustness with a minimal change in the original network's accuracy.
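To make the idea of a Jacobian penalty concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy softmax classifier standing in for a trained network, approximates the input-output Jacobian with central finite differences (a real setup would more likely use automatic differentiation), and adds the Frobenius norm of that Jacobian to a cross-entropy loss. The function names, the weighting parameter `lam`, and the exact form of the combined loss are illustrative assumptions.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def model(weights, x):
    """Toy classifier softmax(W x); a hypothetical stand-in for a trained network."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def jacobian(f, x, eps=1e-5):
    """Central finite-difference Jacobian with J[k][d] = d f_k / d x_d."""
    cols = []
    for d in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[d] += eps
        lo[d] -= eps
        f_hi, f_lo = f(hi), f(lo)
        # Derivative of every output with respect to input coordinate d.
        cols.append([(a - b) / (2 * eps) for a, b in zip(f_hi, f_lo)])
    # cols is indexed [d][k]; transpose so rows are outputs, columns are inputs.
    return [list(row) for row in zip(*cols)]


def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


def regularized_loss(weights, x, label, lam):
    """Cross-entropy plus lam times the Frobenius norm of the Jacobian (assumed form)."""
    probs = model(weights, x)
    cross_entropy = -math.log(probs[label])
    jac = jacobian(lambda inp: model(weights, inp), x)
    return cross_entropy + lam * frobenius_norm(jac)


# Hypothetical weights and input, chosen only for illustration.
W = [[2.0, -1.0], [-1.0, 2.0]]
x = [0.5, -0.3]
print(regularized_loss(W, x, label=0, lam=0.0))  # plain cross-entropy
print(regularized_loss(W, x, label=0, lam=0.5))  # cross-entropy plus Jacobian penalty
```

In a post-processing setting like the one the abstract describes, a loss of this kind would be minimized for a few additional epochs starting from the already trained weights. A small Jacobian norm means the outputs change little under small input perturbations, which is the intuition linking the penalty to adversarial robustness.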