BIAS-ID: A Framework for Analyzing Transformation Biases in AI-Generated Image Detectors
This work gives researchers and developers a framework for evaluating and understanding biases in AI-generated image detectors, a prerequisite for improving their reliability in real-world applications.
The paper addresses the failure of AI-generated image detectors in real-world scenarios, for which a potential root cause is bias in the training data that leads detectors to rely on spurious correlations. The authors propose BIAS-ID, a framework for analyzing and quantifying transformation biases, and validate it by evaluating six detectors across two datasets, revealing strong biases in several state-of-the-art methods.
Given the surge of harmful AI-generated imagery online, reliably distinguishing authentic images from generated ones has become an urgent research topic. While many proposed detection methods perform well under controlled settings, they often collapse when tested on real-world data. A potential root cause is the presence of subtle biases in the detectors' training data. As a result, detectors may rely on spurious correlations instead of learning true forensic artifacts. While a recent line of work has identified the problem, there is not yet an established protocol to evaluate how biased a detector actually is. In this work, we therefore take a step back. First, we discuss what it means for a detector to be biased, and how this differs from a lack of robustness. Second, we propose BIAS-ID, a transparent framework for analyzing and quantifying the presence of transformation biases in AI-generated image detectors. We validate our framework by evaluating six detectors across two datasets, revealing that several state-of-the-art detection methods are strongly affected by biases. Our results highlight the importance of bias-aware evaluation for developing reliable AI-generated image detectors.