CV | Aug 28, 2025

A Spatial-Frequency Aware Multi-Scale Fusion Network for Real-Time Deepfake Detection

Libo Lv, Tianyi Wang, Mengxiao Huang, Ruixia Liu, Yinglong Wang

arXiv:2508.20449v16.21 citations | h-index: 11 | PRCV

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient deepfake detection in applications like video conferencing and social media, though it is an incremental improvement over existing methods.

The paper tackles the problem of real-time deepfake detection by proposing a lightweight network that balances accuracy and efficiency, achieving strong generalization on benchmark datasets.

With the rapid advancement of real-time deepfake generation techniques, forged content is becoming increasingly realistic and widespread across applications like video conferencing and social media. Although state-of-the-art detectors achieve high accuracy on standard benchmarks, their heavy computational cost hinders real-time deployment in practical applications. To address this, we propose the Spatial-Frequency Aware Multi-Scale Fusion Network (SFMFNet), a lightweight yet effective architecture for real-time deepfake detection. We design a spatial-frequency hybrid aware module that jointly leverages spatial textures and frequency artifacts through a gated mechanism, enhancing sensitivity to subtle manipulations. A token-selective cross attention mechanism enables efficient multi-level feature interaction, while a residual-enhanced blur pooling structure helps retain key semantic cues during downsampling. Experiments on several benchmark datasets show that SFMFNet achieves a favorable balance between accuracy and efficiency, with strong generalization and practical value for real-time applications.
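To make two of the ideas in the abstract more concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes a tiny square grayscale patch stored as nested lists, a log-magnitude DFT as the frequency feature, and a scalar sigmoid gate; the names `dft2_magnitude`, `gated_fusion`, `blur_pool` and the parameters `w_s`, `w_f`, `bias` are hypothetical. The abstract does not specify the "residual-enhanced" part of the pooling, so only plain blur pooling (low-pass filter, then subsample) is shown.

```python
import cmath
import math


def dft2_magnitude(patch):
    """Naive 2D DFT magnitude of a small square patch (O(n^4), demo only)."""
    n = len(patch)
    mag = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0j
            for x in range(n):
                for y in range(n):
                    angle = -2.0 * math.pi * (u * x + v * y) / n
                    acc += patch[x][y] * cmath.exp(1j * angle)
            mag[u][v] = abs(acc)
    return mag


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(spatial, freq, w_s=1.0, w_f=1.0, bias=0.0):
    """Blend two same-shaped feature maps with a per-element gate.

    Assumption: the gate is a sigmoid of a weighted sum of both inputs;
    a trained network would learn these weights rather than fix them.
    """
    fused = []
    for s_row, f_row in zip(spatial, freq):
        row = []
        for s, f in zip(s_row, f_row):
            g = sigmoid(w_s * s + w_f * f + bias)
            row.append(g * s + (1.0 - g) * f)
        fused.append(row)
    return fused


def blur_pool(feat, stride=2):
    """Anti-aliased downsampling: 3x3 binomial blur, then subsample."""
    k = (0.25, 0.5, 0.25)  # separable binomial kernel, sums to 1
    h, w = len(feat), len(feat[0])

    def at(i, j):
        # Replicate edge values outside the feature map.
        return feat[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    out = []
    for i in range(0, h, stride):
        row = []
        for j in range(0, w, stride):
            row.append(sum(k[a] * k[b] * at(i + a - 1, j + b - 1)
                           for a in range(3) for b in range(3)))
        out.append(row)
    return out


# Usage on a 4x4 checkerboard patch (a purely high-frequency pattern).
patch = [[float((x + y) % 2) for y in range(4)] for x in range(4)]
freq = [[math.log1p(m) for m in row] for row in dft2_magnitude(patch)]
fused = gated_fusion(patch, freq)
pooled = blur_pool(fused)  # 4x4 -> 2x2
```

The gate lets each position lean on whichever branch responds more strongly, and blurring before subsampling limits aliasing when the feature map is halved. A practical model would use learned convolutions and a fast FFT instead of these loops.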
