CV · Jan 31, 2022

Imperceptible and Multi-channel Backdoor Attack against Deep Neural Networks

Mingfu Xue, Shifeng Ni, Yinghao Wu, Yushu Zhang, Jian Wang, Weiqiang Liu

arXiv:2201.13164v18.822 citations

Originality: Incremental advance

AI Analysis

This work addresses a security vulnerability in deep neural networks that is relevant to AI practitioners and researchers. It offers a more covert and flexible attack method, though the advance is incremental because it builds on existing backdoor attack concepts.

The paper aims to make backdoor attacks against deep neural networks stealthier and more versatile by proposing an imperceptible, multi-channel method based on Discrete Cosine Transform (DCT) steganography. It reports average attack success rates of 93.95% on CIFAR-10 and 91.55% on TinyImageNet for N-to-N attacks, and 90.22% and 89.53% for N-to-One attacks, while maintaining classification accuracy and remaining robust against the Neural Cleanse defense.

Recent research demonstrates that Deep Neural Network (DNN) models are vulnerable to backdoor attacks. A backdoored DNN model behaves maliciously when it receives images containing backdoor triggers. To date, existing backdoor attacks are single-trigger and single-target attacks, and the triggers of most existing backdoor attacks are obvious and thus easy to detect or notice. In this paper, we propose a novel imperceptible and multi-channel backdoor attack against Deep Neural Networks by exploiting Discrete Cosine Transform (DCT) steganography. Specifically, for a color image, we use DCT steganography to construct the trigger on different channels of the image. As a result, the trigger is stealthy and natural. Based on the proposed method, we implement two multi-target, multi-trigger variants: the N-to-N backdoor attack and the N-to-One backdoor attack. Experimental results demonstrate that the average attack success rate of the N-to-N backdoor attack is 93.95% on the CIFAR-10 dataset and 91.55% on the TinyImageNet dataset. The average attack success rate of the N-to-One attack is 90.22% and 89.53% on the CIFAR-10 and TinyImageNet datasets, respectively. Meanwhile, the proposed backdoor attack does not affect the classification accuracy of the DNN model. Moreover, the proposed attack is shown to be robust to a state-of-the-art backdoor defense (Neural Cleanse).
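
To make the idea of a per-channel DCT trigger concrete, here is a minimal sketch of block-wise DCT embedding on a single color channel. It is not the authors' implementation: the abstract does not specify the embedding rule, so the 8x8 block size, the coefficient position `(3, 4)`, the `strength` value, and the function names `dct_matrix`, `embed_trigger`, and `extract_bits` are all assumptions chosen for illustration. The sketch writes one bit per block by forcing the sign of a mid-frequency coefficient, which changes pixel values only slightly.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def embed_trigger(image, channel, bits, strength=8.0, block=8, coeff=(3, 4)):
    """Embed a bit pattern into one channel of an HxWx3 uint8 image.

    Hypothetical rule (an assumption, not taken from the paper): in every
    block, set one mid-frequency DCT coefficient to +strength for bit 1
    and -strength for bit 0. The pattern repeats cyclically over blocks.
    """
    d = dct_matrix(block)
    out = image.astype(np.float64).copy()
    h, w = out.shape[:2]
    idx = 0
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            patch = out[y:y + block, x:x + block, channel]
            c = d @ patch @ d.T  # forward 2-D DCT of the block
            c[coeff] = strength if bits[idx % len(bits)] else -strength
            out[y:y + block, x:x + block, channel] = d.T @ c @ d  # inverse
            idx += 1
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


def extract_bits(image, channel, n_bits, block=8, coeff=(3, 4)):
    """Read back the first n_bits from the sign of the chosen coefficient."""
    d = dct_matrix(block)
    data = image.astype(np.float64)
    h, w = data.shape[:2]
    bits = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            if len(bits) == n_bits:
                return bits
            c = d @ data[y:y + block, x:x + block, channel] @ d.T
            bits.append(int(c[coeff] > 0))
    return bits


# Usage: a CIFAR-10-sized gray image, with the pattern on channel 0 only.
image = np.full((32, 32, 3), 128, dtype=np.uint8)
poisoned = embed_trigger(image, channel=0, bits=[1, 0, 1, 1])
```

Under these assumptions, a multi-channel attack would embed a different pattern in each of the R, G, and B channels. In an N-to-N setting each channel's trigger would be paired with its own target label during poisoning, while in an N-to-One setting several triggers would map to one shared target. Those pairings are made at training time and are outside the scope of this sketch.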
