When Handcrafted Features and Deep Features Meet Mismatched Training and Test Sets for Deepfake Detection
This work addresses the challenge of building a universal deepfake detection system to combat synthetic media threats, but its contribution is incremental because it evaluates existing methods on known train/test mismatches.
The paper tackles deepfake detection by comparing handcrafted and deep features, finding that Xception exceeds 99% accuracy when the training and test sets come from the same sub-dataset but that performance drops sharply when they are mismatched.
The rapid growth of synthetic visual media generation and manipulation now raises significant concerns and poses a serious threat to society. To contend with this threat, there is a pressing need for automatic systems that detect fake digital content and prevent the spread of dangerous artificial information. In this paper, we use and compare two kinds of handcrafted features (SIFT and HOG) and two kinds of deep features (Xception and CNN+RNN) for the deepfake detection task. We also examine how these features perform when the training set and the test set are mismatched. Evaluation is performed on the well-known FaceForensics++ dataset, which contains four sub-datasets: Deepfakes, Face2Face, FaceSwap, and NeuralTextures. The best results come from Xception, whose accuracy exceeds 99\% when the training and test sets both come from the same sub-dataset. In comparison, the results drop dramatically when the training set does not match the test set. This phenomenon reveals the challenge of creating a universal deepfake detection system.
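The matched versus mismatched comparison described above can be pictured as a train-on-one, test-on-all grid over the four sub-datasets. The following is a minimal sketch of such a grid, not the authors' code: the function name `cross_subset_accuracy`, the `fit` and `predict` callables, and the data layout (a list of `(features, label)` pairs per sub-dataset, with label 1 meaning fake) are all assumptions made for illustration. Only the sub-dataset names come from the abstract.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

# Sub-dataset names are taken from the abstract; everything else is hypothetical.
SUBSETS = ("Deepfakes", "Face2Face", "FaceSwap", "NeuralTextures")

# One sample is (feature vector, label), where label 1 = fake and 0 = real (assumed).
Sample = Tuple[Sequence[float], int]


def cross_subset_accuracy(
    train_sets: Dict[str, Sequence[Sample]],
    test_sets: Dict[str, Sequence[Sample]],
    fit: Callable[[Sequence[Sample]], Any],
    predict: Callable[[Any, Sequence[float]], int],
) -> Dict[Tuple[str, str], float]:
    """Train one model per training subset and score it on every test subset.

    Returns a mapping (train_name, test_name) -> accuracy. Entries where the
    two names are equal are the matched case; all others are mismatched.
    """
    results: Dict[Tuple[str, str], float] = {}
    for train_name, train_data in train_sets.items():
        model = fit(train_data)  # any feature/classifier pipeline can be plugged in
        for test_name, test_data in test_sets.items():
            correct = sum(predict(model, x) == y for x, y in test_data)
            results[(train_name, test_name)] = correct / len(test_data)
    return results
```

With one entry per name in `SUBSETS` in both dictionaries, the result has 16 cells: the 4 diagonal cells correspond to the matched setting and the 12 off-diagonal cells to the mismatched setting. Passing `fit` and `predict` as arguments keeps the grid independent of whether the features are handcrafted or deep.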