CV AI LG · Mar 16, 2018

Semantic Adversarial Examples

arXiv:1804.00499v1 28.5217 citations

Originality: Incremental advance

AI Analysis

This paper addresses the vulnerability of neural networks to adversarial attacks by proposing a new class of adversarial examples that are harder to detect. The advance is incremental, since it builds on existing adversarial example research.

The paper tackles the problem of adversarial examples in deep neural networks by introducing semantic adversarial examples: images perturbed to fool the model while still semantically representing the same object as the original. On CIFAR10, VGG16 reaches only 5.7% accuracy on the color-shifted adversarial images.

Deep neural networks are known to be vulnerable to adversarial examples, i.e., images that are maliciously perturbed to fool the model. Generating adversarial examples has been mostly limited to finding small perturbations that maximize the model prediction error. Such images, however, contain artificial perturbations that make them somewhat distinguishable from natural images. This property is used by several defense methods to counter adversarial examples by applying denoising filters or training the model to be robust to small perturbations. In this paper, we introduce a new class of adversarial examples, namely "Semantic Adversarial Examples," as images that are arbitrarily perturbed to fool the model, but in such a way that the modified image semantically represents the same object as the original image. We formulate the problem of generating such images as a constrained optimization problem and develop an adversarial transformation based on the shape bias property of the human cognitive system. In our method, we generate adversarial images by first converting the RGB image into the HSV (Hue, Saturation and Value) color space and then randomly shifting the Hue and Saturation components, while keeping the Value component the same. Our experimental results on the CIFAR10 dataset show that the accuracy of the VGG16 network on adversarial color-shifted images is 5.7%.
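
The color-shift transformation described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes pixels are RGB tuples with components in [0, 1], that hue wraps around, and that saturation is clipped to [0, 1]. The function names, the shift ranges, the trial budget, and the `predict` callable (a stand-in for a classifier) are all hypothetical.

```python
import colorsys
import random


def shift_hue_saturation(pixels, hue_shift, sat_shift):
    """Shift hue and saturation of RGB pixels in [0, 1]; Value is left unchanged.

    Assumption: hue is treated as circular and saturation is clipped to [0, 1].
    """
    shifted = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        h = (h + hue_shift) % 1.0               # hue wraps around the color wheel
        s = min(1.0, max(0.0, s + sat_shift))   # keep saturation in range
        shifted.append(colorsys.hsv_to_rgb(h, s, v))
    return shifted


def random_color_shift_attack(pixels, true_label, predict, max_trials=100, rng=None):
    """Try random hue/saturation shifts until `predict` no longer returns `true_label`.

    `predict` is a hypothetical callable mapping a pixel list to a label.
    Returns the first misclassified image, or None if every trial fails.
    """
    rng = rng or random.Random()
    for _ in range(max_trials):
        candidate = shift_hue_saturation(
            pixels,
            hue_shift=rng.uniform(0.0, 1.0),    # assumed shift range
            sat_shift=rng.uniform(-1.0, 1.0),   # assumed shift range
        )
        if predict(candidate) != true_label:
            return candidate
    return None
```

Because the same shift is applied to every pixel and the Value channel is untouched, brightness structure (and hence object shape) is preserved while colors change, which is the intuition behind the shape-bias argument in the abstract.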

