Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection
This addresses the susceptibility to misuse of synthetic images for generalizable deepfake detection, representing an incremental advance by focusing on generator artifacts rather than detection algorithms.
The paper tackles the problem of deepfake detection by analyzing CNN-based generator architectures, identifying that up-sampling operations produce generalized forgery artifacts, and achieves an 11.6% improvement in performance over existing methods on a dataset with 28 generative models.
Recently, the proliferation of highly realistic synthetic images, facilitated through a variety of GANs and Diffusions, has significantly heightened the susceptibility to misuse. While the primary focus of deepfake detection has traditionally centered on the design of detection algorithms, an investigative inquiry into the generator architectures has remained conspicuously absent in recent years. This paper contributes to this lacuna by rethinking the architectures of CNN-based generators, thereby establishing a generalized representation of synthetic artifacts. Our findings illuminate that the up-sampling operator can, beyond frequency-based artifacts, produce generalized forgery artifacts. In particular, the local interdependence among image pixels caused by upsampling operators is significantly demonstrated in synthetic images generated by GAN or diffusion. Building upon this observation, we introduce the concept of Neighboring Pixel Relationships(NPR) as a means to capture and characterize the generalized structural artifacts stemming from up-sampling operations. A comprehensive analysis is conducted on an open-world dataset, comprising samples generated by \tft{28 distinct generative models}. This analysis culminates in the establishment of a novel state-of-the-art performance, showcasing a remarkable \tft{11.6\%} improvement over existing methods. The code is available at https://github.com/chuangchuangtan/NPR-DeepfakeDetection.