FIND: A Simple yet Effective Baseline for Diffusion-Generated Image Detection
FIND addresses the detection challenge posed by realistic diffusion-generated images and offers a practical, efficient option for content verification. The contribution is incremental, however, since it builds on prior work that uses reconstruction errors.
The paper tackles the detection of diffusion-generated images by proposing FIND, a method that needs only a simple binary classifier trained with noise augmentation. It rests on the observation that real images are harder to fit with Gaussian distributions than synthetic ones. FIND reports an 11.7% performance improvement on the GenImage benchmark while running 126x faster than existing methods.
The remarkable realism of images generated by diffusion models poses critical detection challenges. Current methods utilize reconstruction error as a discriminative feature, exploiting the observation that real images exhibit higher reconstruction errors when processed through diffusion models. However, these approaches require costly reconstruction computations and depend on specific diffusion models, making their performance highly model-dependent. We identify a fundamental difference: real images are more difficult to fit with Gaussian distributions than synthetic ones. In this paper, we propose Forgery Identification via Noise Disturbance (FIND), a novel method that requires only a simple binary classifier. It eliminates reconstruction by directly targeting the core distributional difference between real and synthetic images. Our key operation is to add Gaussian noise to real images during training and label these noisy versions as synthetic. This step allows the classifier to focus on the statistical patterns that distinguish real from synthetic images. We theoretically prove that the noise-augmented real images resemble diffusion-generated images in their ease of Gaussian fitting. Furthermore, because only noise is added, they retain visual similarity to the original images, which highlights the most discriminative distribution-related features. The proposed FIND improves performance by 11.7% on the GenImage benchmark while running 126x faster than existing methods. By removing the need for auxiliary diffusion models and reconstruction, it offers a practical, efficient, and generalizable way to detect diffusion-generated content.
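The key operation described in the abstract, adding Gaussian noise to real images and labeling the noisy copies as synthetic, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `noise_augment_batch`, the label convention, the noise level `sigma=0.1`, and the `[0, 1]` pixel range are all assumptions made for illustration, and the abstract does not say whether genuine synthetic images are also mixed into training.

```python
import numpy as np

# Label convention assumed for this sketch (not specified in the source).
REAL, SYNTHETIC = 0, 1


def noise_augment_batch(real_images, sigma=0.1, rng=None):
    """Build a training batch from real images only.

    real_images: float array of shape (N, H, W, C), values assumed in [0, 1].
    sigma: standard deviation of the added Gaussian noise (hypothetical value).
    Returns (images, labels): the original images labeled REAL, followed by
    their noisy copies labeled SYNTHETIC.
    """
    rng = np.random.default_rng() if rng is None else rng
    real_images = np.asarray(real_images, dtype=np.float32)

    # Add zero-mean Gaussian noise to every pixel of every real image.
    noise = rng.normal(0.0, sigma, size=real_images.shape).astype(np.float32)
    noisy_images = real_images + noise

    # Stack originals and noisy copies; the noisy copies get the synthetic label.
    images = np.concatenate([real_images, noisy_images], axis=0)
    labels = np.concatenate([
        np.full(len(real_images), REAL, dtype=np.int64),
        np.full(len(real_images), SYNTHETIC, dtype=np.int64),
    ])
    return images, labels
```

The resulting `(images, labels)` pair could be fed to any binary classifier. No diffusion model or reconstruction step appears anywhere in this pipeline, which is consistent with the efficiency argument the abstract makes.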