Investigating the Impact of Pre-processing and Prediction Aggregation on the DeepFake Detection Task
This work addresses the detection of manipulated media content, which is crucial for combating online misinformation. Its contribution is incremental, however, as it focuses on optimizing existing detection pipelines.
The paper tackles DeepFake detection by proposing a pre-processing step that improves training data quality and by evaluating video-level prediction aggregation methods. The pre-processing step yields considerable performance improvements, and the aggregation scheme further boosts detection efficiency in videos with multiple faces.
Recent advances in content generation technologies (widely known as DeepFakes), along with the online proliferation of manipulated media content, render the detection of such manipulations a task of increasing importance. Even though there are many DeepFake detection methods, only a few focus on the impact of dataset pre-processing and of the aggregation of frame-level predictions into video-level predictions on model performance. In this paper, we propose a pre-processing step to improve the training data quality and examine its effect on the performance of DeepFake detection. We also propose and evaluate video-level prediction aggregation approaches. Experimental results show that the proposed pre-processing approach leads to considerable improvements in the performance of detection models, and that the proposed prediction aggregation scheme further boosts the detection efficiency in cases where there are multiple faces in a video.
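To make the idea of frame-level to video-level aggregation concrete, the following is a minimal sketch of one possible scheme for videos with multiple faces. It is not the aggregation method proposed in the paper, which the abstract does not specify. It assumes that each detected face has already been assigned a track identifier and a per-frame fake probability. The function name `aggregate_video_score`, the input layout, and the choice of averaging within a face and taking the maximum across faces are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def aggregate_video_score(frame_predictions):
    """Combine frame-level fake probabilities into one video-level score.

    frame_predictions: iterable of (face_id, fake_probability) pairs, one
    per detected face per frame. This layout is a hypothetical convention
    chosen for the sketch.

    Assumed scheme (not taken from the paper): average the probabilities
    of each face over its frames, then take the maximum over faces, so a
    single manipulated face is enough to flag the whole video.
    """
    per_face = defaultdict(list)
    for face_id, prob in frame_predictions:
        per_face[face_id].append(prob)

    if not per_face:
        raise ValueError("no face predictions supplied")

    # Mean fake probability per face across all frames where it appears.
    face_scores = {face_id: mean(probs) for face_id, probs in per_face.items()}

    # Video-level score: the most suspicious face decides.
    video_score = max(face_scores.values())
    return video_score, face_scores


# Two faces tracked over two frames; face "b" looks manipulated.
predictions = [("a", 0.25), ("b", 0.75), ("a", 0.25), ("b", 1.0)]
video_score, face_scores = aggregate_video_score(predictions)
```

Under these assumptions, a flat average over all four predictions would give 0.5625, whereas grouping by face first keeps the score of the suspicious face (0.875) from being diluted by the unmanipulated one.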