LG · CR · Oct 5, 2021

Adversarial defenses via a mixture of generators

arXiv:2110.02364v1 1.6

Originality: Incremental advance

AI Analysis

This paper addresses the vulnerability of deep learning systems to adversarial attacks. The advance is incremental because it builds on existing defense mechanisms.

The paper tackles the problem of defending neural networks against adversarial examples by using a mixture of generators to transform adversarial inputs and recover correct class labels, achieving competitive results on the MNIST dataset without supervision or attack labels.

In spite of the enormous success of neural networks, adversarial examples remain a relatively poorly understood feature of deep learning systems. Considerable effort goes into both building more powerful adversarial attacks and designing methods to counter the effects of adversarial examples. We propose a method to transform the adversarial input data through a mixture of generators in order to recover the correct class obfuscated by the adversarial attack. A canonical set of images is used to generate adversarial examples through potentially multiple attacks. The transformed images are then processed by a set of generators, which are trained adversarially as a whole to compete in inverting the initial transformations. To our knowledge, this is the first use of a mixture-based adversarially trained system as a defense mechanism. We show that it is possible to train such a system without supervision, simultaneously on multiple adversarial attacks. Our system is able to recover class information for previously unseen examples with neither attack nor data labels on the MNIST dataset. The results demonstrate that this multi-attack approach is competitive with adversarial defenses tested in single-attack settings.
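
The competition among generators described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the names (`shift`, `scale`, `canonical_score`, `pick_winner`) are hypothetical, the "attacks" are simple arithmetic transformations rather than real adversarial attacks, the generators are hand-set inverses rather than trained networks, the discriminator is replaced by a distance to the canonical set, and the winner-take-all selection rule is assumed rather than taken from the abstract.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Transform = Callable[[Sequence[float]], Vector]


def shift(delta: float) -> Transform:
    """Return a transformation that adds `delta` to every pixel."""
    return lambda x: [v + delta for v in x]


def scale(factor: float) -> Transform:
    """Return a transformation that multiplies every pixel by `factor`."""
    return lambda x: [v * factor for v in x]


def canonical_score(x: Sequence[float], canonical: Sequence[Sequence[float]]) -> float:
    """Stand-in for a discriminator: higher means closer to the canonical set."""
    return -min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in canonical)


def pick_winner(
    x: Sequence[float],
    generators: Sequence[Transform],
    canonical: Sequence[Sequence[float]],
) -> Tuple[int, Vector]:
    """Run every generator on `x` and keep the output that scores highest."""
    outputs = [g(x) for g in generators]
    scores = [canonical_score(o, canonical) for o in outputs]
    best = max(range(len(outputs)), key=scores.__getitem__)
    return best, outputs[best]


# Hypothetical 3-pixel "images" standing in for the canonical set.
canonical = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
# Toy transformations standing in for adversarial attacks.
attacks = [shift(0.5), scale(2.0)]
# Hand-set inverses; in the paper the generators are learned, not given.
generators = [shift(-0.5), scale(0.5)]

results = []
for attack_id, attack in enumerate(attacks):
    for image in canonical:
        winner, restored = pick_winner(attack(image), generators, canonical)
        results.append((attack_id, winner, restored))
        print(attack_id, winner, restored)
```

In this toy setting each generator wins on the inputs produced by the transformation it inverts, which is the kind of specialization a competitive mixture is meant to encourage. In a trained system the score would come from a learned discriminator and the generators would be updated from that signal.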
