LG CR CV ML

Dec 1, 2019

A Method for Computing Class-wise Universal Adversarial Perturbations

Tejus Gupta, Abhishek Sinha, Nupur Kumari, Mayank Singh, Balaji Krishnamurthy

arXiv:1912.00466v19.112 citations

Originality: Incremental advance

AI Analysis

This work addresses the vulnerability of deep neural networks to adversarial attacks, a critical security issue for AI systems. It is an incremental advance, however, because it builds on prior work on universal perturbations.

The paper tackles the problem of computing class-specific universal adversarial perturbations for deep neural networks. Its method achieves fooling rates of 34% to 51% on ImageNet models, transfers across models, and requires neither training data nor hyper-parameters.

We present an algorithm for computing class-specific universal adversarial perturbations for deep neural networks. Such perturbations can induce misclassification in a large fraction of images of a specific class. Unlike previous methods that use iterative optimization to compute a universal perturbation, the proposed method employs a perturbation that is a linear function of the weights of the neural network and hence can be computed much faster. The method does not require any training data and has no hyper-parameters. The attack obtains a fooling rate of 34% to 51% on state-of-the-art deep neural networks on ImageNet and transfers across models. We also study the characteristics of the decision boundaries learned by standard and adversarially trained models to understand the universal adversarial perturbations.
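To make the phrases "linear function of the weights" and "fooling rate" concrete, here is a minimal sketch on a toy linear classifier. It is not the authors' algorithm for deep networks: the model, the function names `class_perturbation` and `fooling_rate`, the `scale` parameter, and the random data are all assumptions chosen for illustration. For a linear model with logits `W @ x + b`, one plausible class-wise perturbation is the difference between the target and source weight rows, which needs no data and no iterative optimization.

```python
import numpy as np

def class_perturbation(W, source, target, scale):
    """Toy class-wise perturbation for a linear classifier (illustrative only).

    Adding it to any input raises the target logit relative to the source
    logit. It is linear in W: doubling W doubles the result.
    """
    return scale * (W[target] - W[source])

def fooling_rate(W, b, X, source, delta):
    """Fraction of inputs predicted as `source` whose label changes under delta."""
    clean = np.argmax(X @ W.T + b, axis=1)
    mask = clean == source
    if not mask.any():
        return 0.0  # no inputs of the source class to attack
    adv = np.argmax((X[mask] + delta) @ W.T + b, axis=1)
    return float(np.mean(adv != source))

# Hypothetical 3-class model on 8-dimensional inputs with random weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))
b = rng.normal(size=3)
X = rng.normal(size=(200, 8))

# One perturbation shared by every input of class 0, computed from weights alone.
delta = class_perturbation(W, source=0, target=1, scale=0.5)
rate = fooling_rate(W, b, X, source=0, delta=delta)
print(f"fooling rate on class 0: {rate:.2f}")
```

The same perturbation is applied to every input of the source class, which is what makes it "universal" within that class. The data `X` is used only to measure the fooling rate, not to compute `delta`.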
