Knowledge-Guided Prompt Learning for Deepfake Facial Image Detection
This work addresses the challenge of deepfake detection for facial images, which is crucial for security and media integrity, but it appears incremental as it builds on existing prompt learning techniques.
The paper tackles deepfake facial image detection, targeting two weaknesses of earlier detectors: little use of prior knowledge and a domain shift between training and testing categories. It proposes a knowledge-guided prompt learning method that, according to the authors, notably outperforms state-of-the-art methods on the DeepFakeFaceForensics dataset.
Recent generative models demonstrate impressive performance in synthesizing photographic images, making it hard for humans to distinguish them from pristine ones, especially in the case of realistic-looking synthetic facial images. Previous works mostly focus on mining discriminative artifacts from vast amounts of visual data. However, they usually do not explore prior knowledge and rarely pay attention to the domain shift between training categories (e.g., natural and indoor objects) and testing ones (e.g., fine-grained human facial images), resulting in unsatisfactory detection performance. To address these issues, we propose a novel knowledge-guided prompt learning method for deepfake facial image detection. Specifically, we retrieve forgery-related prompts from large language models as expert knowledge to guide the optimization of learnable prompts. In addition, we develop test-time prompt tuning to alleviate the domain shift, achieving significant performance improvement and facilitating application in real-world scenarios. Extensive experiments on the DeepFakeFaceForensics dataset show that our proposed approach notably outperforms state-of-the-art methods.
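The abstract names two ingredients, expert prompts guiding learnable prompts and test-time prompt tuning, without giving their objectives. The following is a minimal sketch of how such objectives could look, not the authors' formulation. It assumes a CLIP-style classifier that scores an image feature against one prompt feature per class, a cosine penalty that pulls each learnable prompt feature toward its expert counterpart, and prediction entropy as the test-time objective (a common choice in test-time prompt tuning). All function names, the `weight` and `temperature` parameters, and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_probs(image_feat, prompt_feats, temperature=0.07):
    """CLIP-style class probabilities: scaled cosine similarity to each prompt."""
    return softmax([cosine(image_feat, p) / temperature for p in prompt_feats])

def knowledge_guided_loss(image_feat, label, learnable, expert, weight=1.0):
    """Cross-entropy plus an assumed penalty that keeps each learnable
    prompt feature close to the matching expert (LLM-derived) feature."""
    probs = class_probs(image_feat, learnable)
    cross_entropy = -math.log(probs[label])
    guidance = sum(1.0 - cosine(l, e) for l, e in zip(learnable, expert)) / len(learnable)
    return cross_entropy + weight * guidance

def prediction_entropy(image_feat, prompt_feats):
    """Entropy of the prediction on an unlabeled test image; an assumed
    objective to minimize when tuning prompts at test time."""
    probs = class_probs(image_feat, prompt_feats)
    return -sum(p * math.log(p) for p in probs if p > 0)

# Toy features: index 0 stands for "real", index 1 for "fake".
image = [0.9, 0.1, 0.0]
learnable_prompts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
expert_prompts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

train_loss = knowledge_guided_loss(image, 0, learnable_prompts, expert_prompts)
test_entropy = prediction_entropy(image, learnable_prompts)
```

In a real system the features would come from a vision-language model's encoders and both objectives would be minimized by gradient descent over the prompt parameters; the sketch only evaluates the losses to show how the two terms are composed.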