CV, Jul 6, 2023

MMNet: Multi-Collaboration and Multi-Supervision Network for Sequential Deepfake Detection

arXiv:2307.02733v141 citations, h-index: 54
Originality: Incremental advance
AI Analysis

This work addresses the challenge of detecting forged face images in sequential contexts and recovering the originals, which is crucial for combating deceptive media used to cause social panic or gain illicit profits. It represents a domain-specific advance in deepfake detection.

The paper tackles sequential deepfake detection, which identifies forged facial regions in the correct sequence so that the image can be recovered. It proposes MMNet, a network that handles various spatial scales and sequential permutations and recovers images without requiring knowledge of the manipulation method. Experiments on several datasets show state-of-the-art detection performance and independent recovery performance.

Advanced manipulation techniques have given criminals opportunities to cause social panic or gain illicit profits by generating deceptive media, such as forged face images. In response, various deepfake detection methods have been proposed to assess image authenticity. Sequential deepfake detection, an extension of deepfake detection, aims to identify forged facial regions in the correct sequence for recovery. However, because spatial and sequential manipulations can be combined in many ways, forged face images exhibit substantial discrepancies that severely impact detection performance. Additionally, recovering forged images requires knowledge of the manipulation model in order to implement inverse transformations, and this is difficult to ascertain because attackers often conceal the relevant techniques. To address these issues, we propose the Multi-Collaboration and Multi-Supervision Network (MMNet), which handles various spatial scales and sequential permutations in forged face images and achieves recovery without requiring knowledge of the corresponding manipulation method. Furthermore, existing evaluation metrics only consider detection accuracy at a single inference step, without accounting for how well predictions match the ground truth across multiple consecutive steps. To overcome this limitation, we propose a novel evaluation metric called Complete Sequence Matching (CSM), which considers the detection accuracy at multiple inference steps, reflecting the ability to detect forged sequences in their entirety. Extensive experiments on several typical datasets demonstrate that MMNet achieves state-of-the-art detection performance and independent recovery performance.
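
The contrast between single-step accuracy and whole-sequence matching can be illustrated with a minimal sketch. The abstract does not give the CSM formula, so the definition used here (a sample counts only if every step of its predicted sequence matches the ground truth) is an assumption. The function names, the region labels, and the "none" padding token are likewise hypothetical.

```python
from typing import List, Sequence


def per_step_accuracy(preds: Sequence[Sequence[str]],
                      targets: Sequence[Sequence[str]]) -> float:
    """Fraction of individual steps predicted correctly, pooled over all samples."""
    correct = 0
    total = 0
    for pred, target in zip(preds, targets):
        if len(pred) != len(target):
            raise ValueError("sequences are assumed to be padded to equal length")
        correct += sum(p == t for p, t in zip(pred, target))
        total += len(target)
    return correct / total if total else 0.0


def complete_sequence_match(preds: Sequence[Sequence[str]],
                            targets: Sequence[Sequence[str]]) -> float:
    """Fraction of samples whose whole predicted sequence equals the ground truth.

    Assumed reading of CSM: one wrong step makes the entire sample a miss.
    """
    if not targets:
        return 0.0
    matches = sum(list(p) == list(t) for p, t in zip(preds, targets))
    return matches / len(targets)


# Hypothetical manipulation sequences, padded to length 3 with "none".
targets: List[List[str]] = [
    ["nose", "eye", "none"],
    ["lip", "none", "none"],
    ["eye", "lip", "nose"],
    ["none", "none", "none"],
]
preds: List[List[str]] = [
    ["nose", "eye", "none"],   # fully correct
    ["lip", "eye", "none"],    # one extra region predicted
    ["eye", "nose", "lip"],    # right regions, wrong order
    ["none", "none", "none"],  # fully correct
]

print(per_step_accuracy(preds, targets))        # 0.75
print(complete_sequence_match(preds, targets))  # 0.5
```

In this toy data the per-step score stays at 0.75 even though only half of the sequences are recovered exactly, which is the kind of gap a sequence-level metric is meant to expose.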

