CV | CL | Dec 12, 2024

Human vs. AI: A Novel Benchmark and a Comparative Study on the Detection of Generated Images and the Impact of Prompts

arXiv:2412.09715v117.819 citations | h-index: 1 | Has Code | COLING Workshops

Originality: Synthesis-oriented

AI Analysis

This work addresses the threat of disinformation from AI-generated images, but it is incremental: it studies the effect of prompt detail on detectability rather than proposing a new detection method.

The study investigates how the level of detail in a prompt affects the detectability of AI-generated images. In a user study with 200 participants, images generated from longer, more detailed prompts were detected significantly more easily than those generated from short prompts, and an AI-based detector likewise performed better on images from longer prompts.

With the advent of publicly available AI-based text-to-image systems, the process of creating photorealistic but fully synthetic images has been largely democratized. This can pose a threat to the public through a simplified spread of disinformation. Machine detectors and human media expertise can help to differentiate between AI-generated (fake) and real images and counteract this danger. Although AI generation models are highly prompt-dependent, the impact of the prompt on the fake detection performance has rarely been investigated yet. This work therefore examines the influence of the prompt's level of detail on the detectability of fake images, both with an AI detector and in a user study. For this purpose, we create a novel dataset, COCOXGEN, which consists of real photos from the COCO dataset as well as images generated with SDXL and Fooocus using prompts of two standardized lengths. Our user study with 200 participants shows that images generated with longer, more detailed prompts are detected significantly more easily than those generated with short prompts. Similarly, an AI-based detection model achieves better performance on images generated with longer prompts. However, humans and AI models seem to pay attention to different details, as we show in a heat map analysis.
