CV, CR | Apr 26, 2022

Restricted Black-box Adversarial Attack Against DeepFake Face Swapping

Junhao Dong, Yuan Wang, Jianhuang Lai, Xiaohua Xie

arXiv:2204.12347v1 | 16.391 citations | h-index: 28

Originality: Incremental advance

AI Analysis

This work addresses the security threat posed by DeepFake fraud in online media. It offers a more practical defense than prior query-intensive methods, though its contribution to adversarial transferability is incremental.

The paper tackles DeepFake face swapping with a practical adversarial attack that requires no queries to the black-box model. It uses a substitute model and TCA-GAN to generate transferable perturbations, which significantly reduce the visual quality of the resulting DeepFake images and make them easier to detect.

DeepFake face swapping, which can replace the source face in an arbitrary photo or video with the target face of an entirely different person, presents a significant threat to online security and social media. To prevent this fraud, some researchers have begun to study adversarial methods against DeepFake or face manipulation. However, existing works focus on the white-box setting or on a black-box setting driven by abundant queries, which severely limits the practical application of these methods. To tackle this problem, we introduce a practical adversarial attack that does not require any queries to the facial image forgery model. Our method is built on a substitute model pursuing face reconstruction and then transfers adversarial examples from the substitute model directly to inaccessible black-box DeepFake models. Specifically, we propose the Transferable Cycle Adversary Generative Adversarial Network (TCA-GAN) to construct the adversarial perturbation for disrupting unknown DeepFake systems. We also present a novel post-regularization module for enhancing the transferability of generated adversarial examples. To comprehensively measure the effectiveness of our approaches, we construct a challenging benchmark of DeepFake adversarial attacks for future development. Extensive experiments show that the proposed adversarial attack method makes the visual quality of DeepFake face images plummet, so that they are easier for humans and algorithms to detect. Moreover, we demonstrate that the proposed algorithm can be generalized to offer face image protection against various face translation methods.
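The query-free transfer setting described in the abstract can be illustrated with a toy sketch. This is not TCA-GAN and not the authors' code: it swaps in plain signed-gradient ascent on a hypothetical linear "reconstruction" model, and every name here (`make_reconstructor`, `craft_perturbation`, `distortion`, the `eps` budget) is an assumption made for illustration. The point it shows is only the workflow: the perturbation is crafted entirely on a substitute model, and the black-box model is touched once, for evaluation.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def make_reconstructor(dim, rank, seed):
    """Hypothetical stand-in for a face model: a symmetric linear map A = W^T W / rank."""
    rng = random.Random(seed)
    w = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(rank)]
    return [[sum(w[k][i] * w[k][j] for k in range(rank)) / rank
             for j in range(dim)] for i in range(dim)]


def distortion(model, image, delta):
    """Squared change in the model output caused by adding delta to the image."""
    clean = matvec(model, image)
    adv = matvec(model, [x + d for x, d in zip(image, delta)])
    return sum((a - c) ** 2 for a, c in zip(adv, clean))


def craft_perturbation(substitute, image, eps=0.05, steps=10, step_size=0.01, seed=0):
    """Signed-gradient ascent on the substitute's output distortion, within an L-inf ball."""
    rng = random.Random(seed)
    # Random start: the gradient of the distortion is zero at delta = 0.
    delta = [rng.uniform(-eps, eps) for _ in image]
    clean = matvec(substitute, image)
    for _ in range(steps):
        adv = [x + d for x, d in zip(image, delta)]
        residual = [a - c for a, c in zip(matvec(substitute, adv), clean)]
        # Gradient is 2 * A^T * residual; A is symmetric and only the sign is used.
        grad = matvec(substitute, residual)
        for i, g in enumerate(grad):
            step = step_size * ((g > 0) - (g < 0))
            d = max(-eps, min(eps, delta[i] + step))                # stay in the eps-ball
            delta[i] = max(0.0, min(1.0, image[i] + d)) - image[i]  # keep pixels in [0, 1]
    return delta


# Toy 8-"pixel" image and two independently drawn models (all values are made up).
pixel_rng = random.Random(1)
image = [pixel_rng.uniform(0.2, 0.8) for _ in range(8)]
substitute = make_reconstructor(8, 4, seed=2)
black_box = make_reconstructor(8, 4, seed=3)  # never queried while crafting

delta = craft_perturbation(substitute, image)
print("substitute distortion:", distortion(substitute, image, delta))
print("black-box distortion: ", distortion(black_box, image, delta))
```

In this toy the two models are unrelated random maps, so how much distortion carries over is arbitrary; the abstract's TCA-GAN and post-regularization module are the parts aimed at making that transfer reliable.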
