CV · Dec 7, 2020

Are DNNs fooled by extremely unrecognizable images?

Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki

arXiv:2012.03843v24.24 citations

Originality: Highly original

AI Analysis

This study identifies a new vulnerability in DNNs, demonstrating that they can be fooled by images extremely far from natural image distributions, which is significant for researchers and practitioners concerned with DNN robustness and security.

This paper investigates whether deep neural networks (DNNs) can be fooled by images that are completely unrecognizable to humans and lack any local or global features of natural objects. The authors introduce "sparse fooling images" (SFIs), which are single-color images with only a few altered pixels. They prove that SFIs exist under mild conditions for both linear and nonlinear models, and show that more complex models are more likely to be vulnerable. Using two SFI generation methods, they demonstrate that SFIs fool DNNs because their features in deeper layers become similar to those of natural images, and they attribute this vulnerability to the max pooling layer.

Fooling images are a potential threat to deep neural networks (DNNs). These images are not recognizable to humans as natural objects, such as dogs and cats, but are misclassified by DNNs as natural-object classes with high confidence scores. Despite their original design concept, existing fooling images retain some features that are characteristic of the target objects on close inspection. Hence, DNNs can react to these features. In this paper, we address the question of whether there can be fooling images with no characteristic pattern of natural objects, either locally or globally. As a minimal case, we introduce single-color images with a few pixels altered, called sparse fooling images (SFIs). We first prove that SFIs always exist under mild conditions for linear and nonlinear models and reveal that complex models are more likely to be vulnerable to SFI attacks. With two SFI generation methods, we demonstrate that in deeper layers, SFIs end up with features similar to those of natural images and consequently fool DNNs successfully. Among the layers examined, we found that the max pooling layer causes the vulnerability to SFIs. The defense against SFIs and their transferability are also discussed. This study highlights a new vulnerability of DNNs by introducing a novel class of images that is distributed extremely far from natural images.
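To make the idea of a single-color image with a few altered pixels concrete, the following is a minimal sketch for a linear classifier. It is not either of the paper's two generation methods: the greedy pixel-selection rule, the function name `sparse_fooling_image`, the random weights, the base color of 0.5, and the 10-pixel budget are all assumptions chosen for illustration.

```python
import numpy as np

def sparse_fooling_image(W, b, target, base=0.5, budget=10):
    """Hypothetical greedy sketch (not the paper's method).

    Starts from a single-color image and alters at most `budget` pixels,
    trying to make the linear model `W @ x + b` predict `target`.
    Pixel values are assumed to lie in [0, 1].
    """
    d = W.shape[1]
    x = np.full(d, base, dtype=float)      # single-color starting image
    altered = np.zeros(d, dtype=bool)      # which pixels have been changed

    for _ in range(budget):
        logits = W @ x + b
        if int(np.argmax(logits)) == target:
            break                          # already classified as target

        # Strongest competing class at the current image.
        rivals = logits.copy()
        rivals[target] = -np.inf
        r = int(np.argmax(rivals))

        # Margin direction: raising g @ x favors target over the rival.
        g = W[target] - W[r]
        best_val = np.where(g > 0, 1.0, 0.0)   # extreme value that helps most
        gain = g * (best_val - x)              # margin gain per pixel
        gain[altered] = -np.inf                # alter each pixel at most once

        i = int(np.argmax(gain))
        if gain[i] <= 0:
            break                          # no remaining pixel helps
        x[i] = best_val[i]
        altered[i] = True

    fooled = int(np.argmax(W @ x + b)) == target
    return x, altered, fooled

# Usage with an assumed random linear model on 8x8 grayscale inputs.
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))
b = rng.normal(size=10)
x, altered, fooled = sparse_fooling_image(W, b, target=3)
print(int(altered.sum()), "pixels altered; fooled:", fooled)
```

Whether the attack succeeds depends on the weights and the pixel budget, so the sketch reports a `fooled` flag rather than assuming success. All unaltered pixels keep the base color, which is the defining property of an SFI as described above.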
