CY · Apr 13

BiasIG: Benchmarking Multi-dimensional Social Biases in Text-to-Image Models

Hanjun Luo, Zhimu Huang, Haoyu Huang, Ziye Deng, Ruizhe Chen, Xinfeng Li, Zuozhu Liu, Hanan Salam

arXiv:2604.11934 · 99.4 · h-index: 14 · Has Code

AI Analysis

For researchers and practitioners in AI fairness, BiasIG provides a fine-grained, taxonomy-driven diagnostic tool to evaluate multi-dimensional biases in T2I models, addressing the lack of comprehensive benchmarks.

BiasIG is a benchmark that quantifies social biases in text-to-image models across 47,040 prompts and 4 dimensions, revealing that debiasing methods often cause unintended confounding effects and tend toward discrimination rather than ignorance.

Text-to-Image (T2I) generative models have revolutionized content creation, yet they inherently risk amplifying societal biases. While sociological research provides systematic classifications of bias, existing T2I benchmarks largely conflate these nuances or focus narrowly on occupational stereotypes, leaving the multi-dimensional nature of generative bias inadequately measured. In this paper, we introduce BiasIG, a unified benchmark that quantifies social biases across a curated dataset of 47,040 prompts. Grounded in sociological and machine ethics frameworks, BiasIG disentangles biases across 4 dimensions to enable fine-grained diagnosis. To facilitate scalable and reliable evaluation, we propose a fully automated pipeline powered by a fine-tuned multi-modal large language model, achieving high alignment accuracy comparable to human experts. Extensive experiments on 8 T2I models and 3 debiasing methods not only validate BiasIG as a robust diagnostic tool, but also reveal critical insights: interventions on protected attributes often trigger unintended confounding effects on unrelated demographics, and debiasing methods exhibit a persistent tendency toward discrimination rather than mere ignorance. Our work advocates for a precise, taxonomy-driven approach to fairness in AIGC, providing a theoretical framework for using BiasIG's metrics as feedback signals in future closed-loop mitigation. The benchmark is openly available at https://github.com/Astarojth/BiasIG.

