CV AIDec 26, 2025

Attack-Aware Deepfake Detection under Counter-Forensic Manipulations

Noor Fatima, Hasan Faraz Khan, Muzammil Behzad

arXiv:2512.22303v13.6h-index: 6

Originality Incremental advance

AI Analysis

This work addresses the challenge of robust deepfake detection for forensic applications, though it appears incremental as it builds on existing methods with specific enhancements for attack resilience.

The paper tackles the problem of deepfake detection under counter-forensic manipulations by proposing an attack-aware detector that combines red-team training with test-time defense, achieving near-perfect ranking across attacks, low calibration error, and minimal abstention risk.

This work presents an attack-aware deepfake and image-forensics detector designed for robustness, well-calibrated probabilities, and transparent evidence under realistic deployment conditions. The method combines red-team training with randomized test-time defense in a two-stream architecture, where one stream encodes semantic content using a pretrained backbone and the other extracts forensic residuals, fused via a lightweight residual adapter for classification, while a shallow Feature Pyramid Network style head produces tamper heatmaps under weak supervision. Red-team training applies worst-of-K counter-forensics per batch, including JPEG realign and recompress, resampling warps, denoise-to-regrain operations, seam smoothing, small color and gamma shifts, and social-app transcodes, while test-time defense injects low-cost jitters such as resize and crop phase changes, mild gamma variation, and JPEG phase shifts with aggregated predictions. Heatmaps are guided to concentrate within face regions using face-box masks without strict pixel-level annotations. Evaluation on existing benchmarks, including standard deepfake datasets and a surveillance-style split with low light and heavy compression, reports clean and attacked performance, AUC, worst-case accuracy, reliability, abstention quality, and weak-localization scores. Results demonstrate near-perfect ranking across attacks, low calibration error, minimal abstention risk, and controlled degradation under regrain, establishing a modular, data-efficient, and practically deployable baseline for attack-aware detection with calibrated probabilities and actionable heatmaps.

View on arXiv PDF

Similar