LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection
This work addresses the detection of unseen deepfake manipulations for security and media verification, and it represents an incremental improvement over existing methods.
The paper tackles the problem of poor generalization in high-quality deepfake detection by introducing LAA-Net, which uses an explicit attention mechanism and an Enhanced Feature Pyramid Network to focus on artifact-prone regions, achieving superior AUC and AP scores on multiple benchmarks.
This paper introduces a novel approach for high-quality deepfake detection called Localized Artifact Attention Network (LAA-Net). Existing methods for high-quality deepfake detection are mainly based on a supervised binary classifier coupled with an implicit attention mechanism. As a result, they do not generalize well to unseen manipulations. To handle this issue, two main contributions are made. First, an explicit attention mechanism within a multi-task learning framework is proposed. By combining heatmap-based and self-consistency attention strategies, LAA-Net is forced to focus on a few small artifact-prone vulnerable regions. Second, an Enhanced Feature Pyramid Network (E-FPN) is proposed as a simple and effective mechanism for spreading discriminative low-level features into the final feature output, with the advantage of limiting redundancy. Experiments performed on several benchmarks show the superiority of the proposed approach in terms of Area Under the Curve (AUC) and Average Precision (AP). The code is available at https://github.com/10Ring/LAA-Net.
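The two reported metrics, AUC and AP, can be illustrated with a minimal sketch. This is not taken from the LAA-Net code base: the function names `roc_auc` and `average_precision`, the label convention (1 = fake, 0 = real), and the toy scores are all assumptions chosen for illustration, and the AP variant shown assumes there are no tied scores.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random fake outscores a random real.

    Assumed convention: label 1 = fake, label 0 = real. Ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one fake and one real sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AP as the mean of the precision values at the rank of each fake sample.

    Samples are ranked by descending score; assumes no tied scores.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one fake sample")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


# Toy example with made-up detector scores (not results from the paper).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

Both metrics are threshold-free: they depend only on how the detector ranks fake samples relative to real ones, which makes them convenient for comparing detectors across datasets whose score distributions differ.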