Bridging the Gap: A Framework for Real-World Video Deepfake Detection via Social Network Compression Emulation
This addresses the challenge of generalizing deepfake detection to compressed video content on social networks, offering a scalable solution for real-world deployment, though it is incremental as it builds on existing detection methods.
The paper tackles the problem of deepfake detectors failing in real-world scenarios due to social network compression by proposing a framework that emulates platform-specific video degradation patterns, enabling detectors fine-tuned on emulated data to achieve comparable performance to those trained on actual shared media, with experiments on FaceForensics++ showing close matching of degradation patterns.
The growing presence of AI-generated videos on social networks poses new challenges for deepfake detection, as detectors trained under controlled conditions often fail to generalize to real-world scenarios. A key factor behind this gap is the aggressive, proprietary compression applied by platforms like YouTube and Facebook, which launder low-level forensic cues. However, replicating these transformations at scale is difficult due to API limitations and data-sharing constraints. For these reasons, we propose a first framework that emulates the video sharing pipelines of social networks by estimating compression and resizing parameters from a small set of uploaded videos. These parameters enable a local emulator capable of reproducing platform-specific artifacts on large datasets without direct API access. Experiments on FaceForensics++ videos shared via social networks demonstrate that our emulated data closely matches the degradation patterns of real uploads. Furthermore, detectors fine-tuned on emulated videos achieve comparable performance to those trained on actual shared media. Our approach offers a scalable and practical solution for bridging the gap between lab-based training and real-world deployment of deepfake detectors, particularly in the underexplored domain of compressed video content.