CV · AI · Dec 16, 2024

FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning

arXiv:2412.12032v3 · 17 citations · h-index: 19 · CVPR
Originality: Incremental advance
AI Analysis

This work addresses the need for better generalization in face security applications such as deepfake and spoofing detection. It is an incremental advance because it builds on existing self-supervised learning methods.

The paper tackles the problem of learning robust, transferable facial representations from unlabeled real faces to improve generalization in face security tasks. It reports state-of-the-art performance across 10 public datasets on tasks such as deepfake detection and face anti-spoofing.

This work asks: given abundant, unlabeled real faces, how can we learn a robust and transferable facial representation that improves the generalization performance of various face security tasks? We make the first attempt and propose FSFM, a self-supervised pretraining framework that learns fundamental representations of real face images by leveraging the synergy between masked image modeling (MIM) and instance discrimination (ID). We explore various facial masking strategies for MIM and present a simple yet powerful CRFR-P masking, which explicitly forces the model to capture meaningful intra-region consistency and challenging inter-region coherency. Furthermore, we devise an ID network that naturally couples with MIM to establish underlying local-to-global correspondence via tailored self-distillation. These three learning objectives (intra-region consistency, inter-region coherency, and local-to-global correspondence), collectively called 3C, enable the model to encode both local features and global semantics of real faces. After pretraining, a vanilla ViT serves as a universal vision foundation model for downstream face security tasks: cross-dataset deepfake detection, cross-domain face anti-spoofing, and unseen diffusion facial forgery detection. Extensive experiments on 10 public datasets demonstrate that our model transfers better than supervised pretraining and than visual and facial self-supervised learning methods, and even outperforms task-specialized SOTA methods.
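The abstract does not spell out how CRFR-P masking works, so the following Python sketch is only an illustration of one plausible reading: fully cover one randomly chosen facial region, then spread the remaining mask budget over the other regions in proportion to their size. The function name `region_aware_mask`, the toy region map, and the 0.75 mask ratio are assumptions made for this example and are not taken from the paper or its code.

```python
import random


def region_aware_mask(region_of_patch, mask_ratio=0.75, seed=None):
    """Return one boolean per patch (True = masked).

    region_of_patch: hypothetical region id for each ViT patch, e.g. derived
    from a face parser. This is an assumed interface, not the paper's.
    """
    rng = random.Random(seed)
    n = len(region_of_patch)
    budget = round(n * mask_ratio)  # total number of patches to mask

    # Group patch indices by region id.
    regions = {}
    for idx, rid in enumerate(region_of_patch):
        regions.setdefault(rid, []).append(idx)

    # Step 1: cover one randomly chosen region (capped by the budget).
    covered = rng.choice(sorted(regions))
    covered_idx = list(regions[covered])
    rng.shuffle(covered_idx)
    masked = set(covered_idx[:budget])

    # Step 2: spread the remaining budget over the other regions,
    # proportionally to each region's patch count.
    remaining = budget - len(masked)
    others = [i for rid in sorted(regions) if rid != covered
              for i in regions[rid]]
    if remaining > 0 and others:
        for rid in sorted(regions):
            if rid == covered:
                continue
            idxs = regions[rid]
            k = min(len(idxs), int(remaining * len(idxs) / len(others)))
            masked.update(rng.sample(idxs, k))
        # Top up any patches lost to rounding down.
        left = [i for i in others if i not in masked]
        rng.shuffle(left)
        masked.update(left[: budget - len(masked)])

    return [i in masked for i in range(n)]


# Toy usage: a 14x14 patch grid split into four horizontal bands that
# stand in for facial regions (purely illustrative).
grid = 14
region_of_patch = [(i // grid) // 4 for i in range(grid * grid)]
mask = region_aware_mask(region_of_patch, mask_ratio=0.75, seed=0)
print(sum(mask), "of", len(mask), "patches masked")
```

Under this reading, the fully covered region forces the model to infer a whole facial part from the rest of the face (inter-region coherency), while the partially masked regions are reconstructed from their own visible patches (intra-region consistency). The actual region definitions and ratios used by FSFM should be taken from the paper and its repository.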

Code Implementations: 1 repo
