MS-GAGA: Metric-Selective Guided Adversarial Generation Attack
This work addresses the challenge of strengthening adversarial attacks against deepfake detectors, which matters for security applications, though the contribution is incremental because it builds on existing methods such as PGD.
The paper tackles the problem of generating adversarial examples that are both transferable and visually imperceptible, so that they fool deepfake detectors in black-box settings. It reports up to 27% higher misclassification rates on unseen detectors than state-of-the-art attacks.
We present MS-GAGA (Metric-Selective Guided Adversarial Generation Attack), a two-stage framework for crafting transferable and visually imperceptible adversarial examples against deepfake detectors in black-box settings. In Stage 1, a dual-stream attack module generates adversarial candidates: MNTD-PGD applies enhanced gradient calculations optimized for small perturbation budgets, while SG-PGD focuses perturbations on visually salient regions. This complementary design expands the adversarial search space and improves transferability across unseen models. In Stage 2, a metric-aware selection module evaluates candidates based on both their success against black-box models and their structural similarity (SSIM) to the original image. By jointly optimizing transferability and imperceptibility, MS-GAGA achieves up to 27% higher misclassification rates on unseen detectors compared to state-of-the-art attacks.
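To make the Stage 2 idea concrete, the following is a minimal sketch of a metric-aware selection step, not the authors' implementation. The abstract does not say how attack success and SSIM are combined, so the rule here (rank by number of detectors fooled, break ties by SSIM) is an assumption. The names `global_ssim` and `select_candidate` are hypothetical, images are flattened grayscale pixel lists in [0, 1], and SSIM is computed over a single global window rather than the usual sliding window.

```python
from statistics import fmean
from typing import Callable, Sequence, Tuple

Image = Sequence[float]            # flattened grayscale pixels in [0, 1]
Detector = Callable[[Image], int]  # returns 1 for "fake", 0 for "real"


def global_ssim(x: Image, y: Image, data_range: float = 1.0) -> float:
    """SSIM over one global window (a simplification of windowed SSIM)."""
    c1 = (0.01 * data_range) ** 2  # standard stabilising constants
    c2 = (0.03 * data_range) ** 2
    mx, my = fmean(x), fmean(y)
    vx = fmean((a - mx) ** 2 for a in x)
    vy = fmean((b - my) ** 2 for b in y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


def select_candidate(
    original: Image,
    candidates: Sequence[Image],
    detectors: Sequence[Detector],
    true_label: int = 1,
) -> Tuple[Image, Tuple[int, float]]:
    """Return the candidate with the best (detectors fooled, SSIM) score.

    Assumed rule: maximise the number of black-box detectors that
    misclassify the candidate, then prefer higher SSIM to the original.
    """
    best, best_key = None, None
    for cand in candidates:
        fooled = sum(1 for detect in detectors if detect(cand) != true_label)
        key = (fooled, global_ssim(original, cand))
        if best_key is None or key > best_key:
            best, best_key = cand, key
    return best, best_key
```

In the two-stage framing, `candidates` would hold the outputs of the two Stage 1 streams for a given image, and `detectors` would wrap whatever black-box models are available for scoring. A weighted sum of success rate and SSIM would be an equally plausible selection rule under the abstract's description.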