CV | Nov 12, 2024

Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors

Anisha Pal, Julia Kruk, Mansi Phute, Manognya Bhattaram, Diyi Yang, Duen Horng Chau, Judy Hoffman

Georgia Tech

arXiv:2411.07472v1 | 14.119 citations | h-index: 48 | Has Code | NIPS

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for reliable detection of AI-generated images to combat misinformation, providing a standardized benchmark for evaluating detector robustness, though it is incremental as it builds on existing detection methods.

The authors tackled the problem of evaluating the robustness of AI-generated image detectors by introducing the SEMI-TRUTHS dataset, which includes over 1.4 million AI-augmented images with metadata, and found that state-of-the-art detectors show varying sensitivities to different perturbations and data distributions.

Text-to-image diffusion models have impactful applications in art, design, and entertainment, yet these technologies also pose significant risks by enabling the creation and dissemination of misinformation. Although recent advancements have produced AI-generated image detectors that claim robustness against various augmentations, their true effectiveness remains uncertain. Do these detectors reliably identify images with different levels of augmentation? Are they biased toward specific scenes or data distributions? To investigate, we introduce SEMI-TRUTHS, featuring 27,600 real images, 223,400 masks, and 1,472,700 AI-augmented images that feature targeted and localized perturbations produced using diverse augmentation techniques, diffusion models, and data distributions. Each augmented image is accompanied by metadata for standardized and targeted evaluation of detector robustness. Our findings suggest that state-of-the-art detectors exhibit varying sensitivities to the types and degrees of perturbations, data distributions, and augmentation methods used, offering new insights into their performance and limitations. The code for the augmentation and evaluation pipeline is available at https://github.com/J-Kruk/SemiTruths.
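The per-image metadata is what makes targeted evaluation possible: detector outputs can be grouped by any metadata attribute and compared across groups. The following is a minimal sketch of that idea, not the authors' evaluation pipeline. The field names (`change_size`, `method`, `detector_score`), the toy values, and the 0.5 decision threshold are all assumptions made for illustration; the actual metadata schema is defined in the linked repository.

```python
from collections import defaultdict


def recall_by_group(records, group_key, threshold=0.5):
    """Return, per metadata group, the fraction of AI-augmented images
    whose detector score meets the threshold (i.e. flagged as fake)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec["detector_score"] >= threshold:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Toy records: every image here is AI-augmented, so recall is the
# relevant metric. All field names and values are hypothetical.
records = [
    {"change_size": "small", "method": "inpainting", "detector_score": 0.31},
    {"change_size": "small", "method": "prompt_editing", "detector_score": 0.62},
    {"change_size": "large", "method": "inpainting", "detector_score": 0.91},
    {"change_size": "large", "method": "prompt_editing", "detector_score": 0.77},
]

by_size = recall_by_group(records, "change_size")
by_method = recall_by_group(records, "method")
```

Comparing `by_size` against `by_method` on the same records shows whether a detector's misses concentrate in a particular perturbation degree or augmentation technique, which is the kind of sensitivity the abstract reports.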
