CV | Oct 28, 2025

A Dual-Branch CNN for Robust Detection of AI-Generated Facial Forgeries

arXiv:2510.24640v1 | 1 citation | h-index: 2 | CSCloud

Originality: Incremental advance

AI Analysis

This paper addresses the threat that AI-generated facial forgeries pose to AI security and digital media integrity. It is an incremental advance that introduces a hybrid method combining spatial and frequency cues.

The paper tackles the problem of detecting AI-generated facial forgeries by proposing a dual-branch CNN that uses spatial and frequency cues, achieving strong performance across multiple forgery types and outperforming average human accuracy on the DiFF benchmark.

The rapid advancement of generative AI has enabled the creation of highly realistic forged facial images, posing significant threats to AI security, digital media integrity, and public trust. Face forgery techniques, ranging from face swapping and attribute editing to powerful diffusion-based image synthesis, are increasingly being used for malicious purposes such as misinformation, identity fraud, and defamation. This growing challenge underscores the urgent need for robust and generalizable face forgery detection methods as a critical component of AI security infrastructure. In this work, we propose a novel dual-branch convolutional neural network for face forgery detection that leverages complementary cues from both spatial and frequency domains. The RGB branch captures semantic information, while the frequency branch focuses on high-frequency artifacts that are difficult for generative models to suppress. A channel attention module is introduced to adaptively fuse these heterogeneous features, highlighting the most informative channels for forgery discrimination. To guide the network's learning process, we design a unified loss function, FSC Loss, that combines focal loss, supervised contrastive loss, and a frequency center margin loss to enhance class separability and robustness. We evaluate our model on the DiFF benchmark, which includes forged images generated from four representative methods: text-to-image, image-to-image, face swap, and face edit. Our method achieves strong performance across all categories and outperforms average human accuracy. These results demonstrate the model's effectiveness and its potential contribution to safeguarding AI ecosystems against visual forgery attacks.
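Of the three terms in FSC Loss, focal loss has a standard published form, so it can be illustrated without guessing at the paper's details. The following is a minimal sketch of the binary focal loss for a real-versus-forged classifier, not the authors' implementation. The function names, the `gamma` and `alpha` defaults, and the per-sample formulation are assumptions for illustration. The abstract does not state the hyperparameters, how the three terms are weighted, or how the frequency center margin loss is defined, so those parts are left out.

```python
import math


def binary_focal_loss(p_fake, label, gamma=2.0, alpha=0.25):
    """Focal loss for a single sample (hypothetical helper).

    p_fake: predicted probability that the image is forged, in (0, 1).
    label:  1 for forged, 0 for real.
    gamma, alpha: illustrative defaults, not values taken from the paper.
    """
    eps = 1e-12
    # Probability the model assigns to the true class.
    p_t = p_fake if label == 1 else 1.0 - p_fake
    # Class-balancing weight for the true class.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0 - eps)
    # The (1 - p_t)^gamma factor down-weights samples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch given as plain Python lists."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy. Larger `gamma` values shift the training signal toward hard, misclassified samples, which is the usual motivation for using focal loss on imbalanced or uneven-difficulty data.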

