CV · Feb 22, 2024

Debiasing Text-to-Image Diffusion Models

arXiv:2402.14577v1 · 12 citations · h-index: 15 · Has Code · Proceedings of the 1st ACM Multimedia Workshop on Multi-modal Misinformation Governance in the Era of Foundation Models
Originality: Incremental advance
AI Analysis

This work addresses bias concerns that affect users of text-to-image systems, but the contribution is incremental because it builds on existing debiasing efforts.

The paper tackles social bias in text-to-image diffusion models by proposing an iterative distribution alignment method, which the authors report to be efficient and fast to converge.

Learning-based Text-to-Image (TTI) models like Stable Diffusion have revolutionized the way visual content is generated in various domains. However, recent research has shown that non-negligible social bias exists in current state-of-the-art TTI systems, which raises important concerns. In this work, we aim to resolve the social bias in TTI diffusion models. We begin by formalizing the problem setting and use the text descriptions of bias groups to establish an unsafe direction for guiding the diffusion process. Next, we simplify the problem into a weight optimization problem and attempt a reinforcement learning solver, Policy Gradient, which shows sub-optimal performance with slow convergence. To overcome these limitations, we propose an iterative distribution alignment (IDA) method. Despite its simplicity, IDA is efficient and converges quickly in resolving the social bias in TTI diffusion models. Our code will be released.
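
The abstract does not spell out the IDA update rule, so the following is only a toy illustration of the general idea of iteratively adjusting per-group weights until the generated group distribution matches a target. Everything here is an assumption made for illustration: the softmax "generator" standing in for a diffusion model plus group classifier, the names `sample_group_frequencies` and `align_weights`, the step size, and the additive update. It is not the authors' method.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_group_frequencies(base_logits, weights, n_samples, rng):
    """Hypothetical stand-in for 'generate images, then classify each into a group'.

    The biased model is simulated as a categorical distribution whose logits
    are shifted by the current per-group guidance weights.
    """
    probs = softmax([b + w for b, w in zip(base_logits, weights)])
    draws = rng.choices(range(len(probs)), weights=probs, k=n_samples)
    return [draws.count(g) / n_samples for g in range(len(probs))]


def align_weights(base_logits, target, rounds=50, step=1.0, n_samples=2000, seed=0):
    """Iteratively nudge weights toward the target group distribution.

    Assumed update (not from the paper): raise the weight of under-represented
    groups and lower the weight of over-represented ones, in proportion to the
    gap between target and observed frequencies.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(base_logits)
    for _ in range(rounds):
        observed = sample_group_frequencies(base_logits, weights, n_samples, rng)
        weights = [w + step * (t - o) for w, t, o in zip(weights, target, observed)]
    return weights


# A simulated model that strongly favours group 0, aligned to a uniform target.
base_logits = [2.0, 0.5, 0.0]
target = [1 / 3, 1 / 3, 1 / 3]
weights = align_weights(base_logits, target)
final = softmax([b + w for b, w in zip(base_logits, weights)])
print([round(p, 2) for p in final])
```

In this toy setting each round only needs observed group frequencies, not gradients through the generator, which is one plausible reason an alignment-style update could converge faster than a policy-gradient search over the same weights.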

