Improving Video Deepfake Detection: A DCT-Based Approach with Patch-Level Analysis
This work addresses the detection of manipulated videos, a problem relevant to security and media integrity, but its contribution is incremental because it builds on existing DCT and patch-level methods.
The paper tackles deepfake detection in videos by analyzing specific facial regions with DCT-based features, and finds that the eye and mouth regions are the most discriminative for classification.
A new algorithm for detecting deepfakes in digital videos is presented. The I-frames are extracted to provide faster computation and analysis than approaches described in the literature. To identify the discriminative regions within individual video frames, the entire frame, background, face, eyes, nose, mouth, and face frame are analyzed separately. The Beta components are extracted from the AC coefficients of the Discrete Cosine Transform (DCT) and used as input to standard classifiers. Experimental results show that the eye and mouth regions are the most discriminative and are sufficient to determine the nature of the video under analysis.
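The feature extraction step can be illustrated with a minimal sketch. The abstract does not define how the Beta components are computed, so the code below rests on assumptions: the region is split into non-overlapping 8x8 blocks (as in JPEG), each of the 63 AC frequencies is modeled by a zero-centered Laplacian distribution, and its scale is estimated as the mean absolute coefficient value. The names `dct_matrix` and `beta_features` are hypothetical and are not taken from the paper.

```python
import numpy as np

BLOCK = 8  # assumed JPEG-style block size; not stated in the abstract


def dct_matrix(n=BLOCK):
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)     # DC row has a different normalization
    return mat


def beta_features(gray):
    """Return 63 scale estimates, one per AC frequency (raster order).

    Assumption: each AC coefficient follows a zero-centered Laplacian,
    whose maximum-likelihood scale is the mean absolute value.
    """
    gray = np.asarray(gray, dtype=np.float64)
    # Crop so both dimensions are multiples of the block size.
    h = gray.shape[0] - gray.shape[0] % BLOCK
    w = gray.shape[1] - gray.shape[1] % BLOCK
    if h == 0 or w == 0:
        raise ValueError("region must be at least 8x8 pixels")
    # Split into non-overlapping blocks: shape (num_blocks, 8, 8).
    blocks = (gray[:h, :w]
              .reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK)
              .swapaxes(1, 2)
              .reshape(-1, BLOCK, BLOCK))
    d = dct_matrix()
    coeffs = d @ blocks @ d.T                        # 2-D DCT of every block
    ac = coeffs.reshape(-1, BLOCK * BLOCK)[:, 1:]    # drop the DC term
    return np.mean(np.abs(ac), axis=0)
```

Under these assumptions, each analyzed region (for example a grayscale crop of the eyes or the mouth) yields a 63-dimensional vector that could be passed to a standard classifier. The coefficients are kept in raster order here; a zig-zag ordering would only permute the vector.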