DDNet: A Dual-Stream Graph Learning and Disentanglement Framework for Temporal Forgery Localization

Boyang Zhao, Xin Liao, Jiaxin Chen, Xiaoshuai Wu, Yufeng Wu

arXiv:2601.01784v11.5h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses the challenge of precisely localizing tampered segments in videos for applications such as media forensics. It appears incremental, since it builds on existing methods by improving the capture of global anomalies.

The paper tackles the problem of temporal forgery localization in videos by proposing DDNet, a dual-stream graph learning and disentanglement framework, which outperforms state-of-the-art methods by approximately 9% in AP@0.95 and shows improved cross-domain robustness.
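To make the AP@0.95 figure concrete: in temporal localization, a predicted segment is usually counted as correct only if its temporal intersection-over-union (tIoU) with a ground-truth segment reaches the threshold, here 0.95. The sketch below shows that matching criterion only, not the full average-precision computation. The function names and the (start, end)-in-seconds segment format are assumptions made for illustration and are not taken from the paper.

```python
def temporal_iou(pred, gt):
    """tIoU of two segments, each given as (start, end) in seconds."""
    # Overlap is zero when the segments do not intersect.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.95):
    """True if the prediction overlaps the ground truth tightly enough."""
    return temporal_iou(pred, gt) >= threshold


# A 2-second forged segment predicted 0.2 s too early at its start:
# accepted at a loose threshold, rejected at the strict one.
print(is_match((2.0, 4.0), (2.2, 4.0), threshold=0.5))
print(is_match((2.0, 4.0), (2.2, 4.0), threshold=0.95))
```

At a threshold this strict, small boundary errors are enough to turn a prediction into a miss, which is why gains at AP@0.95 reflect boundary precision rather than coarse detection.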

The rapid evolution of AIGC technology makes it possible to mislead viewers by tampering with only small segments of a video, rendering video-level detection inaccurate and unpersuasive. Consequently, temporal forgery localization (TFL), which aims to precisely pinpoint tampered segments, becomes critical. However, existing methods are often constrained by a local view and fail to capture global anomalies. To address this, we propose a dual-stream graph learning and disentanglement framework for temporal forgery localization (DDNet). By coordinating a Temporal Distance Stream for local artifacts and a Semantic Content Stream for long-range connections, DDNet prevents global cues from being drowned out by local smoothness. Furthermore, we introduce Trace Disentanglement and Adaptation (TDA) to isolate generic forgery fingerprints, alongside Cross-Level Feature Embedding (CLFE) to construct a robust feature foundation via deep fusion of hierarchical features. Experiments on the ForgeryNet and TVIL benchmarks demonstrate that our method outperforms state-of-the-art approaches by approximately 9% in AP@0.95, with significant improvements in cross-domain robustness.
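The abstract does not specify how the two streams build their graphs, so the following is only a hypothetical sketch of the general idea: one graph links frames that are close in time, the other links frames whose features are similar regardless of temporal distance. The function names, the window size, the k-nearest-neighbour rule, and the toy features are all assumptions made for illustration. They are not DDNet's actual construction.

```python
import math


def temporal_edges(num_frames, window=1):
    """Local view: link each frame to neighbours within `window` time steps."""
    return {(i, j)
            for i in range(num_frames)
            for j in range(max(0, i - window), min(num_frames, i + window + 1))
            if i != j}


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def semantic_edges(features, k=1):
    """Global view: link each frame to its k most similar frames, at any distance."""
    edges = set()
    n = len(features)
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: cosine(features[i], features[j]),
                        reverse=True)
        edges.update((i, j) for j in ranked[:k])
    return edges


# Toy per-frame features: frames 0 and 3 look alike, as do frames 1 and 2.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.1]]
local_graph = temporal_edges(len(features), window=1)
global_graph = semantic_edges(features, k=1)
# Long-range links that the temporal graph alone would miss.
print(sorted(global_graph - local_graph))
```

In this toy example the semantic graph connects frames 0 and 3, which are three steps apart and therefore invisible to the window-based graph. That is the kind of long-range connection the abstract attributes to the Semantic Content Stream.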
