LG, HC | May 28, 2021

Visualizing Representations of Adversarially Perturbed Inputs

arXiv:2105.14116v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the vulnerability of deep learning models to adversarial attacks by giving researchers a metric for analyzing and visualizing the effects of those attacks. The contribution is incremental in nature.

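The paper studies how adversarial attacks affect the intermediate activations of neural networks. It introduces POP-N, an evaluation metric that scores how effectively data is projected to N dimensions for visualizing representations of adversarially perturbed inputs. It then compares POP-2 scores across several dimensionality reduction algorithms and adversarial attacks on CIFAR-10, and generates example visualizations from the high-scoring 2D projections.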

It has been shown that deep learning models are vulnerable to adversarial attacks. We seek to further understand the consequence of such attacks on the intermediate activations of neural networks. We present an evaluation metric, POP-N, which scores the effectiveness of projecting data to N dimensions in the context of visualizing representations of adversarially perturbed inputs. We conduct experiments on CIFAR-10 to compare the POP-2 score of several dimensionality reduction algorithms across various adversarial attacks. Finally, we utilize the 2D data corresponding to high POP-2 scores to generate example visualizations.
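The general workflow described in the abstract (project representations of clean and perturbed inputs to 2D, then score the projection) might look roughly like the sketch below. This is only an illustration under stated assumptions: the abstract does not define POP-N, so `mean_displacement` is a hypothetical stand-in score and not the paper's metric; the activations are synthetic random data rather than CIFAR-10 features; the perturbation is a bounded sign shift rather than a real adversarial attack; and PCA is used merely as one example of a dimensionality reduction algorithm.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Fit PCA on `features` (rows are samples).

    Returns the projected points, the feature mean, and the principal
    directions, so that other data can be projected into the same space.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components


def mean_displacement(clean_2d, perturbed_2d):
    """Hypothetical stand-in score (NOT POP-N).

    Average Euclidean distance in the projected space between each clean
    point and its perturbed counterpart.
    """
    return float(np.linalg.norm(clean_2d - perturbed_2d, axis=1).mean())


rng = np.random.default_rng(0)

# Synthetic stand-in for intermediate activations: 200 samples, 64 features.
clean = rng.normal(size=(200, 64))

# Hypothetical perturbation: a small bounded shift per feature, not a real attack.
perturbed = clean + 0.1 * np.sign(rng.normal(size=clean.shape))

# Fit the projection on clean activations, then reuse it for perturbed ones.
clean_2d, mean, components = pca_project(clean, n_components=2)
perturbed_2d = (perturbed - mean) @ components.T

score = mean_displacement(clean_2d, perturbed_2d)
print(f"stand-in displacement score: {score:.4f}")
```

Fitting the projection on clean activations and reusing it for the perturbed ones keeps both sets in the same 2D coordinate system, which is what makes a paired comparison meaningful. Swapping in another dimensionality reduction algorithm would only require replacing `pca_project`.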

Code Implementations: 1 repo