LG, CR, ML
May 30, 2019

Identifying Classes Susceptible to Adversarial Attacks

arXiv:1905.13284v1
7 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of improving adversarial robustness for image classification systems, but it is incremental as it focuses on identifying susceptible classes rather than providing a new defense method.

The paper tackles the problem of identifying which classes in deep learning image classifiers are most vulnerable to adversarial attacks, using distance-based measures to map original to adversarial classes and reduce model randomness, with experiments on MNIST, Fashion MNIST, and CIFAR-10 datasets.

Despite numerous attempts to defend deep-learning-based image classifiers, they remain susceptible to adversarial attacks. This paper proposes a technique to identify susceptible classes, that is, the classes that are more easily subverted. To identify them, we apply distance-based measures to a trained model. Based on the distances among the original classes, we create a mapping between original classes and adversarial classes that significantly reduces the randomness of a model in an adversarial setting. We analyze the high-dimensional geometry among the feature classes and identify the k most susceptible target classes in an adversarial attack. We conduct experiments using MNIST, Fashion MNIST, and CIFAR-10 (ImageNet and ResNet-32) datasets. Finally, we evaluate our techniques to determine which distance-based measure works best and how the randomness of a model changes with perturbation.
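
The idea of ranking likely adversarial target classes by distance can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each class is summarized by the centroid of its feature vectors and that Euclidean distance between centroids is the measure, whereas the abstract only says "distance-based measures". The function names `class_centroids` and `k_nearest_classes` and the toy data are hypothetical.

```python
import numpy as np


def class_centroids(features, labels):
    """Return the sorted class ids and the mean feature vector of each class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def k_nearest_classes(features, labels, k=1):
    """For each class, list the k other classes whose centroids are closest.

    Assumption: a nearby centroid marks a likely adversarial target class.
    """
    classes, centroids = class_centroids(features, labels)
    # Pairwise Euclidean distances between centroids, shape (C, C).
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # a class is never its own target
    order = np.argsort(dist, axis=1)[:, :k]
    return {int(c): [int(classes[j]) for j in row]
            for c, row in zip(classes, order)}


# Toy 2-D "features" for three classes; classes 0 and 1 sit close together.
features = np.array([[0.0, 0.0], [0.0, 2.0],
                     [3.0, 0.0], [3.0, 2.0],
                     [20.0, 0.0], [20.0, 2.0]])
labels = np.array([0, 0, 1, 1, 2, 2])
nearest = k_nearest_classes(features, labels, k=1)
print(nearest)
```

In practice the features would presumably come from a trained model (for example, penultimate-layer activations) rather than hand-written points, and other distance measures could be swapped in for the Euclidean norm.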

