CV · Aug 28, 2025

A Spatial-Frequency Aware Multi-Scale Fusion Network for Real-Time Deepfake Detection

arXiv:2508.20449v1 · 1 citation · h-index: 11 · PRCV
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient deepfake detection in applications like video conferencing and social media, though it is an incremental improvement over existing methods.

The paper tackles real-time deepfake detection by proposing a lightweight network that balances accuracy and efficiency and generalizes well on benchmark datasets.

With the rapid advancement of real-time deepfake generation techniques, forged content is becoming increasingly realistic and widespread across applications like video conferencing and social media. Although state-of-the-art detectors achieve high accuracy on standard benchmarks, their heavy computational cost hinders real-time deployment in practical applications. To address this, we propose the Spatial-Frequency Aware Multi-Scale Fusion Network (SFMFNet), a lightweight yet effective architecture for real-time deepfake detection. We design a spatial-frequency hybrid aware module that jointly leverages spatial textures and frequency artifacts through a gated mechanism, enhancing sensitivity to subtle manipulations. A token-selective cross attention mechanism enables efficient multi-level feature interaction, while a residual-enhanced blur pooling structure helps retain key semantic cues during downsampling. Experiments on several benchmark datasets show that SFMFNet achieves a favorable balance between accuracy and efficiency, with strong generalization and practical value for real-time applications.
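The abstract does not spell out how the residual-enhanced blur pooling structure is built, so the following is only a minimal sketch of the general idea, not the authors' layer. It assumes a standard 3x3 binomial blur followed by stride-2 subsampling. The residual branch (a max-pooled shortcut scaled by `alpha`) and every function name are hypothetical choices made for illustration.

```python
from typing import List

Grid = List[List[float]]


def blur_pool_2d(x: Grid, stride: int = 2) -> Grid:
    """Blur with a 3x3 binomial kernel, then keep every `stride`-th sample."""
    k = [1.0, 2.0, 1.0]  # 1D binomial taps; the 3x3 outer product sums to 16
    h, w = len(x), len(x[0])

    def at(i: int, j: int) -> float:
        # Replicate (edge) padding: clamp indices into the valid range.
        return x[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    return [
        [
            sum(k[di + 1] * k[dj + 1] * at(i + di, j + dj)
                for di in (-1, 0, 1) for dj in (-1, 0, 1)) / 16.0
            for j in range(0, w, stride)
        ]
        for i in range(0, h, stride)
    ]


def max_pool_2d(x: Grid, stride: int = 2) -> Grid:
    """Plain max pooling over stride x stride windows (clipped at the border)."""
    h, w = len(x), len(x[0])
    return [
        [
            max(x[a][b]
                for a in range(i, min(i + stride, h))
                for b in range(j, min(j + stride, w)))
            for j in range(0, w, stride)
        ]
        for i in range(0, h, stride)
    ]


def residual_blur_pool(x: Grid, alpha: float = 0.5) -> Grid:
    """Hypothetical residual variant: blurred output plus a scaled max-pool shortcut."""
    blurred, sharp = blur_pool_2d(x), max_pool_2d(x)
    return [[b + alpha * s for b, s in zip(rb, rs)] for rb, rs in zip(blurred, sharp)]


# A single bright pixel at an odd position, which naive stride-2 sampling skips.
image = [[0.0] * 4 for _ in range(4)]
image[1][1] = 16.0
naive = [row[::2] for row in image[::2]]
print(naive)
print(blur_pool_2d(image))
print(residual_blur_pool(image))
```

On this toy input the naive subsample is all zeros, while the blurred version keeps a response of 1.0 in every output cell. The assumed shortcut additionally restores a sharp peak in the cell that contained the bright pixel. In a real network the same operations would run per channel on feature maps, and the shortcut would more likely be a learned projection than a fixed max pool.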

