CL, CV, CY · Apr 15, 2025

Using LLMs as prompt modifier to avoid biases in AI image generators

arXiv:2504.11104v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

The paper addresses bias in text-to-image generation for users who want more diverse and fair outputs. The advance is incremental, since it builds on existing LLM and image-generator technologies.

The study tackles bias in AI image generators by using Large Language Models (LLMs) to modify user prompts. In experiments with Stable Diffusion XL, Stable Diffusion 3.5, and Flux, the modified prompts significantly increased image diversity and reduced bias without altering the generators themselves.

This study examines how Large Language Models (LLMs) can reduce biases in text-to-image generation systems by modifying user prompts. We define bias as a model's unfair deviation from population statistics given neutral prompts. Our experiments with Stable Diffusion XL, 3.5 and Flux demonstrate that LLM-modified prompts significantly increase image diversity and reduce bias without the need to change the image generators themselves. While occasionally producing results that diverge from original user intent for elaborate prompts, this approach generally provides more varied interpretations of underspecified requests rather than superficial variations. The method works particularly well for less advanced image generators, though limitations persist for certain contexts like disability representation. All prompts and generated images are available at https://iisys-hof.github.io/llm-prompt-img-gen/
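The abstract's definition of bias as deviation from population statistics can be made concrete with a small sketch. The example below is an illustration under stated assumptions, not the paper's metric: it uses total variation distance between the attribute shares observed in a batch of generated images and a reference population distribution. The function name `bias_score`, the category labels, and all numbers are hypothetical.

```python
from collections import Counter


def bias_score(observed_labels, population_shares):
    """Total variation distance between observed and reference shares.

    Assumption: this metric is chosen for illustration only; the paper's
    own measurement procedure may differ.
    Returns 0.0 for a perfect match and 1.0 for complete mismatch.
    """
    if not observed_labels:
        raise ValueError("need at least one observed label")
    counts = Counter(observed_labels)
    total = len(observed_labels)
    # Cover categories that appear on either side of the comparison.
    categories = set(population_shares) | set(counts)
    return 0.5 * sum(
        abs(counts.get(c, 0) / total - population_shares.get(c, 0.0))
        for c in categories
    )


# Hypothetical reference distribution and hypothetical annotation results
# for ten images generated from a neutral prompt.
reference = {"female": 0.5, "male": 0.5}
baseline = ["male"] * 9 + ["female"] * 1   # original prompt
modified = ["male"] * 6 + ["female"] * 4   # LLM-modified prompt

print(round(bias_score(baseline, reference), 3))
print(round(bias_score(modified, reference), 3))
```

A lower score means the generated batch sits closer to the reference distribution. Comparing the score for the original prompt against the score for the LLM-modified prompt gives one simple way to quantify the kind of bias reduction the abstract describes.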
