Towards Generalizable Deepfake Detection with Locality-aware AutoEncoder
This addresses the challenge of detecting deepfakes that generalize to unseen manipulations, which is crucial for mitigating societal risks from realistic fake media, though it is an incremental advance in method design.
The paper tackles the problem of deepfake detection generalization by proposing a Locality-Aware AutoEncoder (LAE) that learns intrinsic representations from forgery regions, achieving state-of-the-art improvements of 6.52%, 12.03%, and 3.08% in generalization accuracy on three unseen manipulation tasks.
With advancements of deep learning techniques, it is now possible to generate super-realistic images and videos, i.e., deepfakes. These deepfakes could reach mass audience and result in adverse impacts on our society. Although lots of efforts have been devoted to detect deepfakes, their performance drops significantly on previously unseen but related manipulations and the detection generalization capability remains a problem. Motivated by the fine-grained nature and spatial locality characteristics of deepfakes, we propose Locality-Aware AutoEncoder (LAE) to bridge the generalization gap. In the training process, we use a pixel-wise mask to regularize local interpretation of LAE to enforce the model to learn intrinsic representation from the forgery region, instead of capturing artifacts in the training set and learning superficial correlations to perform detection. We further propose an active learning framework to select the challenging candidates for labeling, which requires human masks for less than 3% of the training data, dramatically reducing the annotation efforts to regularize interpretations. Experimental results on three deepfake detection tasks indicate that LAE could focus on the forgery regions to make decisions. The analysis further shows that LAE outperforms the state-of-the-arts by 6.52%, 12.03%, and 3.08% respectively on three deepfake detection tasks in terms of generalization accuracy on previously unseen manipulations.