CV Jul 28, 2025

AV-Deepfake1M++: A Large-Scale Audio-Visual Deepfake Benchmark with Real-World Perturbations

Zhixi Cai, Kartik Kuckreja, Shreya Ghosh, Akanksha Chuchra, Muhammad Haris Khan, Usman Tariq, Tom Gedeon, Abhinav Dhall

arXiv:2507.20579v1 10 citations h-index: 16 MM

Originality Synthesis-oriented

AI Analysis

This dataset addresses the problem of deepfake detection for researchers by providing a comprehensive benchmark, though it is incremental as an extension of an existing dataset.

The authors tackled the need for diverse datasets to detect deepfakes by proposing AV-Deepfake1M++, a large-scale benchmark with 2 million video clips that includes varied manipulation strategies and real-world perturbations, and they benchmarked it using state-of-the-art methods.

The rapid surge of text-to-speech and face-voice reenactment models has made video fabrication easier and its results highly realistic. To counter this problem, we require datasets that are rich in generation methods and in the perturbation strategies that are common in online videos. To this end, we propose AV-Deepfake1M++, an extension of AV-Deepfake1M comprising 2 million video clips with diversified manipulation strategies and audio-visual perturbations. This paper describes the data generation strategies and benchmarks AV-Deepfake1M++ using state-of-the-art methods. We believe that this dataset will play a pivotal role in facilitating research in the deepfake domain. Based on this dataset, we host the 2025 1M-Deepfakes Detection Challenge. The challenge details, dataset, and evaluation scripts are available online under a research-only license at https://deepfakes1m.github.io/2025.
