MFFI: Multi-Dimensional Face Forgery Image Dataset for Real-World Scenarios
This work addresses the need for more realistic datasets to improve Deepfake detection, a concern for social security, but it is incremental in that it builds on existing dataset creation efforts.
The authors tackle the limited diversity of existing Deepfake detection datasets by proposing the MFFI dataset, which covers 50 forgery methods and 1024K image samples to better reflect real-world scenarios. Benchmark evaluations show that it outperforms existing public datasets in scene complexity, cross-domain generalization, and detection difficulty gradients.
Rapid advances in Artificial Intelligence Generated Content (AIGC) have enabled increasingly sophisticated face forgeries, posing a significant threat to social security. However, current Deepfake detection methods are limited by constraints in existing datasets, which lack the diversity found in real-world scenarios. Specifically, these datasets fall short in four key areas: coverage of unknown advanced forgery techniques, variability of facial scenes, richness of real data, and degradation from real-world propagation. To address these challenges, we propose the Multi-dimensional Face Forgery Image (MFFI) dataset, tailored for real-world scenarios. MFFI enhances realism along four strategic dimensions: 1) Wider Forgery Methods; 2) Varied Facial Scenes; 3) Diversified Authentic Data; 4) Multi-level Degradation Operations. MFFI integrates 50 different forgery methods and contains 1024K image samples. Benchmark evaluations show that MFFI outperforms existing public datasets in terms of scene complexity, cross-domain generalization capability, and detection difficulty gradients. These results validate the technical advance and practical utility of MFFI in simulating real-world conditions. The dataset and additional details are publicly available at https://github.com/inclusionConf/MFFI.