Game Theory for Adversarial Attacks and Defenses
This work addresses security issues for deep learning models in adversarial settings, but it is incremental, as it applies existing game-theoretic concepts to defense strategies.
The paper tackles the problem of adversarial attacks on deep neural networks by proposing defense methods using game theory, and the results show that three techniques—random initialization, stochastic activation pruning, and super resolution—effectively improve model robustness.
Adversarial attacks generate adversarial inputs by applying small but intentionally worst-case perturbations to samples from the dataset, causing even state-of-the-art deep neural networks to output incorrect answers with high confidence. Adversarial defense techniques have therefore been developed to improve the security and robustness of models and to protect them from attack. Over time, a game-like competition has formed between attackers and defenders, in which each player tries to play its best strategy against the other while maximizing its own payoff. To solve the game, each player chooses an optimal strategy based on its prediction of the opponent's strategy choice. In this work, we take the defender's side and apply game-theoretic approaches to defending against attacks. We use two randomization methods, random initialization and stochastic activation pruning, to create diversity among networks. Furthermore, we use one denoising technique, super resolution, to improve models' robustness by preprocessing images before attacks. Our experimental results indicate that these three methods can effectively improve the robustness of deep neural networks.
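To make the game framing concrete, the following is a minimal sketch of a two-player zero-sum game between a defender and an attacker, approximated with fictitious play, in which each player repeatedly best-responds to the opponent's observed strategy frequencies. Nothing here is taken from the paper: the function name fictitious_play, the payoff matrix, and the interpretation of its entries as classification accuracy are all hypothetical and chosen for illustration only, and fictitious play is a stand-in solver, not necessarily the one the authors use.

```python
from typing import List, Tuple


def fictitious_play(
    payoff: List[List[float]], rounds: int = 20000
) -> Tuple[List[float], List[float], float]:
    """Approximate an equilibrium of a zero-sum matrix game.

    payoff[i][j] is the defender's payoff when the defender plays row i and
    the attacker plays column j; the attacker receives the negative of it.
    Returns (defender mix, attacker mix, payoff under those mixes).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols

    # Arbitrary opening moves for both players.
    row_counts[0] += 1
    col_counts[0] += 1

    for _ in range(rounds - 1):
        # Defender: expected payoff of each row against the attacker's history.
        row_scores = [
            sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        # Attacker: expected payoff of each column against the defender's history.
        col_scores = [
            sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Defender maximizes, attacker minimizes (ties go to the first index).
        row_counts[row_scores.index(max(row_scores))] += 1
        col_counts[col_scores.index(min(col_scores))] += 1

    defender = [c / rounds for c in row_counts]
    attacker = [c / rounds for c in col_counts]
    value = sum(
        defender[i] * payoff[i][j] * attacker[j]
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return defender, attacker, value


# Hypothetical accuracies: rows are two defended models (A, B), columns are
# attacks crafted against A and against B. These numbers are made up.
accuracy = [
    [0.1, 0.8],  # model A: weak against its own attack, strong against B's
    [0.7, 0.2],  # model B: strong against A's attack, weak against its own
]
defender_mix, attacker_mix, game_value = fictitious_play(accuracy)
print("defender mix:", defender_mix)
print("attacker mix:", attacker_mix)
print("approximate game value:", game_value)
```

For this hypothetical matrix, the exact mixed equilibrium can be worked out by hand: the defender plays model A with probability 5/12, the attacker splits evenly between the two attacks, and the expected accuracy is 0.45. The best the defender can guarantee with a single fixed model is 0.2, which illustrates why randomizing over diverse networks can help. Fictitious play only approaches these values as the number of rounds grows.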