CV · AI · Apr 9

Beyond Surface Artifacts: Capturing Shared Latent Forgery Knowledge Across Modalities

arXiv:2604.07763 · 33.52 citations · h-index: 3
Predicted impact top 17% in CV · last 90 days
Originality: Highly original
AI Analysis

This paper addresses the catastrophic performance degradation of forensic models under multimodal deepfake attacks. It offers a pioneering technical pathway for universal defense, though it appears incremental relative to existing generalization methods.

The paper tackles the generalization bottleneck in multimodal deepfake detection by shifting from feature fusion to modality generalization, introducing a modality-agnostic framework that extracts shared latent forgery knowledge and achieves significant performance improvements on unseen modalities.

As generative artificial intelligence evolves, deepfake attacks have escalated from single-modality manipulations to complex, multimodal threats. Existing forensic techniques face a severe generalization bottleneck: by relying excessively on superficial, modality-specific artifacts, they neglect the shared latent forgery knowledge hidden beneath variable physical appearances. Consequently, these models suffer catastrophic performance degradation when confronted with unseen "dark modalities." To break this limitation, this paper introduces a paradigm shift that redefines multimodal forensics from conventional "feature fusion" to "modality generalization." We propose the first modality-agnostic forgery (MAF) detection framework. By explicitly decoupling modality-specific styles, MAF precisely extracts the essential, cross-modal latent forgery knowledge. Furthermore, we define two progressive dimensions to quantify model generalization: transferability toward semantically correlated modalities (Weak MAF), and robustness against completely isolated signals of "dark modality" (Strong MAF). To rigorously assess these generalization limits, we introduce the DeepModal-Bench benchmark, which integrates diverse multimodal forgery detection algorithms and adapts state-of-the-art generalized learning methods. This study not only empirically proves the existence of universal forgery traces but also achieves significant performance breakthroughs on unknown modalities via the MAF framework, offering a pioneering technical pathway for universal multimodal defense.
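The two generalization dimensions in the abstract can be read as a labelling rule over train/test modality pairs. The following is a minimal sketch of that reading, not the paper's protocol: the modality names, the `CORRELATED` map, and the helpers `maf_setting` and `leave_one_modality_out` are all hypothetical, since the abstract does not list the modalities or say which ones count as semantically correlated.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical correlation map for illustration only: which modalities are
# treated as "semantically correlated" is an assumption, not from the paper.
CORRELATED: Dict[str, Set[str]] = {
    "image": {"video"},
    "video": {"image", "audio"},
    "audio": {"video"},
    "text": set(),
}


def maf_setting(
    train_modalities: Sequence[str],
    test_modality: str,
    correlated: Dict[str, Set[str]] = CORRELATED,
) -> str:
    """Label an evaluation as 'seen', 'weak' (Weak MAF) or 'strong' (Strong MAF)."""
    train = set(train_modalities)
    if test_modality in train:
        return "seen"  # not a modality-generalization test at all
    if correlated.get(test_modality, set()) & train:
        return "weak"  # unseen, but correlated with a training modality
    return "strong"  # unseen and isolated: a "dark modality"


def leave_one_modality_out(
    modalities: Sequence[str],
) -> List[Tuple[List[str], str]]:
    """Build (train_modalities, held_out_modality) pairs, one per modality."""
    return [([m for m in modalities if m != held], held) for held in modalities]


if __name__ == "__main__":
    for train, held in leave_one_modality_out(["image", "video", "audio", "text"]):
        print(train, "->", held, ":", maf_setting(train, held))
```

Under this assumed map, holding out `text` yields a Strong MAF test, while holding out `video` yields a Weak MAF test because `image` and `audio` remain in training.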

