CV · Aug 27, 2025

Improving Generalization in Deepfake Detection with Face Foundation Models and Metric Learning

arXiv:2508.19730v2 · h-index: 8 · Proceedings of the 2nd International Workshop on Diffusion of Harmful Content on Online Web
Originality: Incremental advance
AI Analysis

This work addresses the need for robust deepfake detection in real-world media, though it appears incremental because it builds on existing models and techniques.

The authors tackle poor generalization in deepfake detection with a framework that combines face foundation models and metric learning, and they report strong performance across diverse benchmarks.

The increasing realism and accessibility of deepfakes have raised critical concerns about media authenticity and information integrity. Despite recent advances, deepfake detection models often struggle to generalize beyond their training distributions, particularly when applied to media content found in the wild. In this work, we present a robust video deepfake detection framework with strong generalization that takes advantage of the rich facial representations learned by face foundation models. Our method is built on top of FSFM, a self-supervised model trained on real face data, and is further fine-tuned using an ensemble of deepfake datasets spanning both face-swapping and face-reenactment manipulations. To enhance discriminative power, we incorporate triplet loss variants during training, guiding the model to produce more separable embeddings between real and fake samples. Additionally, we explore attribution-based supervision schemes, where deepfakes are categorized by manipulation type or source dataset, to assess their impact on generalization. Extensive experiments across diverse evaluation benchmarks demonstrate the effectiveness of our approach, especially in challenging real-world scenarios.
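The abstract does not say which triplet loss variants are used, so the following is only a minimal sketch of the standard triplet margin loss that such variants build on. The function names, the Euclidean distance, and the margin value are illustrative assumptions, not details taken from the paper. The idea is that an anchor embedding should sit closer to a same-class sample (the positive) than to an other-class sample (the negative) by at least a margin.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss for a single triplet.

    anchor and positive share a class (e.g. both real faces);
    negative comes from the other class (e.g. a fake).
    The margin of 1.0 is an assumed default, not a value from the paper.
    """
    d_ap = euclidean(anchor, positive)  # distance to the same-class sample
    d_an = euclidean(anchor, negative)  # distance to the other-class sample
    return max(0.0, d_ap - d_an + margin)


def mean_triplet_loss(triplets, margin=1.0):
    """Average the loss over a list of (anchor, positive, negative) tuples."""
    losses = [triplet_margin_loss(a, p, n, margin) for a, p, n in triplets]
    return sum(losses) / len(losses)
```

A triplet whose negative is already farther from the anchor than the positive by more than the margin contributes zero loss, so training effort goes to the triplets where real and fake embeddings are still poorly separated.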

