CVSep 13, 2019

Towards Generalizable Deepfake Detection with Locality-aware AutoEncoder

arXiv:1909.05999v242 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of detecting deepfakes that generalize to unseen manipulations, which is crucial for mitigating societal risks from realistic fake media, though it is an incremental advance in method design.

The paper tackles the problem of deepfake detection generalization by proposing a Locality-Aware AutoEncoder (LAE) that learns intrinsic representations from forgery regions, achieving state-of-the-art improvements of 6.52%, 12.03%, and 3.08% in generalization accuracy on three unseen manipulation tasks.

With advancements of deep learning techniques, it is now possible to generate super-realistic images and videos, i.e., deepfakes. These deepfakes could reach mass audience and result in adverse impacts on our society. Although lots of efforts have been devoted to detect deepfakes, their performance drops significantly on previously unseen but related manipulations and the detection generalization capability remains a problem. Motivated by the fine-grained nature and spatial locality characteristics of deepfakes, we propose Locality-Aware AutoEncoder (LAE) to bridge the generalization gap. In the training process, we use a pixel-wise mask to regularize local interpretation of LAE to enforce the model to learn intrinsic representation from the forgery region, instead of capturing artifacts in the training set and learning superficial correlations to perform detection. We further propose an active learning framework to select the challenging candidates for labeling, which requires human masks for less than 3% of the training data, dramatically reducing the annotation efforts to regularize interpretations. Experimental results on three deepfake detection tasks indicate that LAE could focus on the forgery regions to make decisions. The analysis further shows that LAE outperforms the state-of-the-arts by 6.52%, 12.03%, and 3.08% respectively on three deepfake detection tasks in terms of generalization accuracy on previously unseen manipulations.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes