LG · CV · Jul 3, 2025

Fair Deepfake Detectors Can Generalize

arXiv:2507.02645v1 · 6 citations · h-index: 13
Originality: Highly original
AI Analysis

This work aims to improve both fairness and generalization in deepfake detection for AI security applications, and it presents its contribution as both theoretical and practical.

The paper tackles the conflicting objectives of generalization and fairness in deepfake detection by uncovering a causal relationship between them, and proposes a framework (DAID) that achieves superior performance in both fairness and generalization across three cross-domain benchmarks.

Deepfake detection models face two critical challenges: generalization to unseen manipulations and demographic fairness among population groups. However, existing approaches often demonstrate that these two objectives are inherently conflicting, revealing a trade-off between them. In this paper, we, for the first time, uncover and formally define a causal relationship between fairness and generalization. Building on the back-door adjustment, we show that controlling for confounders (data distribution and model capacity) enables improved generalization via fairness interventions. Motivated by this insight, we propose Demographic Attribute-insensitive Intervention Detection (DAID), a plug-and-play framework composed of: i) Demographic-aware data rebalancing, which employs inverse-propensity weighting and subgroup-wise feature normalization to neutralize distributional biases; and ii) Demographic-agnostic feature aggregation, which uses a novel alignment loss to suppress sensitive-attribute signals. Across three cross-domain benchmarks, DAID consistently achieves superior performance in both fairness and generalization compared to several state-of-the-art detectors, validating both its theoretical foundation and practical effectiveness.
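The data-rebalancing component described in the abstract (inverse-propensity weighting plus subgroup-wise feature normalization) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `inverse_propensity_weights` and `subgroup_normalize` are hypothetical, and the propensity is assumed here to be simply the empirical frequency of each demographic group, which may be simpler than the estimator the paper uses.

```python
import numpy as np

def inverse_propensity_weights(groups):
    """Per-sample weights proportional to 1 / P(group).

    Assumption: P(group) is the empirical group frequency in the data.
    Weights are rescaled so that their mean is 1.
    """
    groups = np.asarray(groups)
    values, counts = np.unique(groups, return_counts=True)
    propensity = dict(zip(values.tolist(), (counts / groups.size).tolist()))
    weights = np.array([1.0 / propensity[g] for g in groups.tolist()])
    return weights / weights.mean()

def subgroup_normalize(features, groups, eps=1e-6):
    """Standardize features separately within each demographic subgroup."""
    features = np.asarray(features, dtype=float)
    groups = np.asarray(groups)
    out = np.empty_like(features)
    for g in np.unique(groups):
        mask = groups == g
        mu = features[mask].mean(axis=0)
        sigma = features[mask].std(axis=0)
        # eps guards against zero variance (e.g. a single-sample group)
        out[mask] = (features[mask] - mu) / (sigma + eps)
    return out

# Toy example: group "a" is over-represented relative to group "b".
groups = np.array(["a", "a", "a", "b"])
features = np.array([[1.0], [2.0], [3.0], [10.0]])
weights = inverse_propensity_weights(groups)
normalized = subgroup_normalize(features, groups)
```

With these weights, each group contributes the same total weight to a weighted training loss regardless of its size, and after normalization each group's features are centered at zero, so a detector cannot separate groups by a simple shift in feature statistics.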
