CRApr 15, 2021
Robust Backdoor Attacks against Deep Neural Networks in Real Physical WorldMingfu Xue, Can He, Shichang Sun et al.
Deep neural networks (DNN) have been widely deployed in various applications. However, many researches indicated that DNN is vulnerable to backdoor attacks. The attacker can create a hidden backdoor in target DNN model, and trigger the malicious behaviors by submitting specific backdoor instance. However, almost all the existing backdoor works focused on the digital domain, while few studies investigate the backdoor attacks in real physical world. Restricted to a variety of physical constraints, the performance of backdoor attacks in the real physical world will be severely degraded. In this paper, we propose a robust physical backdoor attack method, PTB (physical transformations for backdoors), to implement the backdoor attacks against deep learning models in the real physical world. Specifically, in the training phase, we perform a series of physical transformations on these injected backdoor instances at each round of model training, so as to simulate various transformations that a backdoor may experience in real world, thus improves its physical robustness. Experimental results on the state-of-the-art face recognition model show that, compared with the backdoor methods that without PTB, the proposed attack method can significantly improve the performance of backdoor attacks in real physical world. Under various complex physical conditions, by injecting only a very small ratio (0.5%) of backdoor instances, the attack success rate of physical backdoor attacks with the PTB method on VGGFace is 82%, while the attack success rate of backdoor attacks without the proposed PTB method is lower than 11%. Meanwhile, the normal performance of the target DNN model has not been affected.
CRMar 2, 2021
ActiveGuard: An Active DNN IP Protection Technique via Adversarial ExamplesMingfu Xue, Shichang Sun, Can He et al.
The training of Deep Neural Networks (DNN) is costly, thus DNN can be considered as the intellectual properties (IP) of model owners. To date, most of the existing protection works focus on verifying the ownership after the DNN model is stolen, which cannot resist piracy in advance. To this end, we propose an active DNN IP protection method based on adversarial examples against DNN piracy, named ActiveGuard. ActiveGuard aims to achieve authorization control and users' fingerprints management through adversarial examples, and can provide ownership verification. Specifically, ActiveGuard exploits the elaborate adversarial examples as users' fingerprints to distinguish authorized users from unauthorized users. Legitimate users can enter fingerprints into DNN for identity authentication and authorized usage, while unauthorized users will obtain poor model performance due to an additional control layer. In addition, ActiveGuard enables the model owner to embed a watermark into the weights of DNN. When the DNN is illegally pirated, the model owner can extract the embedded watermark and perform ownership verification. Experimental results show that, for authorized users, the test accuracy of LeNet-5 and Wide Residual Network (WRN) models are 99.15% and 91.46%, respectively, while for unauthorized users, the test accuracy of the two DNNs are only 8.92% (LeNet-5) and 10% (WRN), respectively. Besides, each authorized user can pass the fingerprint authentication with a high success rate (up to 100%). For ownership verification, the embedded watermark can be successfully extracted, while the normal performance of the DNN model will not be affected. Further, ActiveGuard is demonstrated to be robust against fingerprint forgery attack, model fine-tuning attack and pruning attack.
CVNov 27, 2020
3D Invisible CloakMingfu Xue, Can He, Zhiyu Wu et al.
In this paper, we propose a novel physical stealth attack against the person detectors in real world. The proposed method generates an adversarial patch, and prints it on real clothes to make a three dimensional (3D) invisible cloak. Anyone wearing the cloak can evade the detection of person detectors and achieve stealth. We consider the impacts of those 3D physical constraints (i.e., radian, wrinkle, occlusion, angle, etc.) on person stealth attacks, and propose 3D transformations to generate 3D invisible cloak. We launch the person stealth attacks in 3D physical space instead of 2D plane by printing the adversarial patches on real clothes under challenging and complex 3D physical scenarios. The conventional and 3D transformations are performed on the patch during its optimization process. Further, we study how to generate the optimal 3D invisible cloak. Specifically, we explore how to choose input images with specific shapes and colors to generate the optimal 3D invisible cloak. Besides, after successfully making the object detector misjudge the person as other objects, we explore how to make a person completely disappeared, i.e., the person will not be detected as any objects. Finally, we present a systematic evaluation framework to methodically evaluate the performance of the proposed attack in digital domain and physical world. Experimental results in various indoor and outdoor physical scenarios show that, the proposed person stealth attack method is robust and effective even under those complex and challenging physical conditions, such as the cloak is wrinkled, obscured, curved, and from different angles. The attack success rate in digital domain (Inria data set) is 86.56%, while the static and dynamic stealth attack performance in physical world is 100% and 77%, respectively, which are significantly better than existing works.
CRNov 27, 2020
Use the Spear as a Shield: A Novel Adversarial Example based Privacy-Preserving Technique against Membership Inference AttacksMingfu Xue, Chengxiang Yuan, Can He et al.
Recently, the membership inference attack poses a serious threat to the privacy of confidential training data of machine learning models. This paper proposes a novel adversarial example based privacy-preserving technique (AEPPT), which adds the crafted adversarial perturbations to the prediction of the target model to mislead the adversary's membership inference model. The added adversarial perturbations do not affect the accuracy of target model, but can prevent the adversary from inferring whether a specific data is in the training set of the target model. Since AEPPT only modifies the original output of the target model, the proposed method is general and does not require modifying or retraining the target model. Experimental results show that the proposed method can reduce the inference accuracy and precision of the membership inference model to 50%, which is close to a random guess. Further, for those adaptive attacks where the adversary knows the defense mechanism, the proposed AEPPT is also demonstrated to be effective. Compared with the state-of-the-art defense methods, the proposed defense can significantly degrade the accuracy and precision of membership inference attacks to 50% (i.e., the same as a random guess) while the performance and utility of the target model will not be affected.
CVNov 27, 2020
NaturalAE: Natural and Robust Physical Adversarial Examples for Object DetectorsMingfu Xue, Chengxiang Yuan, Can He et al.
In this paper, we propose a natural and robust physical adversarial example attack method targeting object detectors under real-world conditions. The generated adversarial examples are robust to various physical constraints and visually look similar to the original images, thus these adversarial examples are natural to humans and will not cause any suspicions. First, to ensure the robustness of the adversarial examples in real-world conditions, the proposed method exploits different image transformation functions, to simulate various physical changes during the iterative optimization of the adversarial examples generation. Second, to construct natural adversarial examples, the proposed method uses an adaptive mask to constrain the area and intensities of the added perturbations, and utilizes the real-world perturbation score (RPS) to make the perturbations be similar to those real noises in physical world. Compared with existing studies, our generated adversarial examples can achieve a high success rate with less conspicuous perturbations. Experimental results demonstrate that, the generated adversarial examples are robust under various indoor and outdoor physical conditions, including different distances, angles, illuminations, and photographing. Specifically, the attack success rate of generated adversarial examples indoors and outdoors is high up to 73.33% and 82.22%, respectively. Meanwhile, the proposed method ensures the naturalness of the generated adversarial example, and the size of added perturbations is much smaller than the perturbations in the existing works. Further, the proposed physical adversarial attack method can be transferred from the white-box models to other object detection models.
CVNov 27, 2020
SocialGuard: An Adversarial Example Based Privacy-Preserving Technique for Social ImagesMingfu Xue, Shichang Sun, Zhiyu Wu et al.
The popularity of various social platforms has prompted more people to share their routine photos online. However, undesirable privacy leakages occur due to such online photo sharing behaviors. Advanced deep neural network (DNN) based object detectors can easily steal users' personal information exposed in shared photos. In this paper, we propose a novel adversarial example based privacy-preserving technique for social images against object detectors based privacy stealing. Specifically, we develop an Object Disappearance Algorithm to craft two kinds of adversarial social images. One can hide all objects in the social images from being detected by an object detector, and the other can make the customized sensitive objects be incorrectly classified by the object detector. The Object Disappearance Algorithm constructs perturbation on a clean social image. After being injected with the perturbation, the social image can easily fool the object detector, while its visual quality will not be degraded. We use two metrics, privacy-preserving success rate and privacy leakage rate, to evaluate the effectiveness of the proposed method. Experimental results show that, the proposed method can effectively protect the privacy of social images. The privacy-preserving success rates of the proposed method on MS-COCO and PASCAL VOC 2007 datasets are high up to 96.1% and 99.3%, respectively, and the privacy leakage rates on these two datasets are as low as 0.57% and 0.07%, respectively. In addition, compared with existing image processing methods (low brightness, noise, blur, mosaic and JPEG compression), the proposed method can achieve much better performance in privacy protection and image visual quality maintenance.