Order Is Not Layout: Order-to-Space Bias in Image Generation

arXiv:2603.03714v1
h-index: 7
Originality: Highly original
AI Analysis

This work addresses a critical reliability issue for users of image generation models, such as artists and designers: it exposes and mitigates a data-driven bias that leads to incorrect layouts.

The paper identifies a systematic bias in image generation models where the order of entities in text prompts spuriously determines spatial layout and role assignments, often causing incorrect outputs. It introduces OTS-Bench to quantify this bias and shows that targeted fine-tuning and early-stage interventions can substantially reduce it while maintaining generation quality.

We study a systematic bias in modern image generation models: the mention order of entities in text spuriously determines spatial layout and entity-role binding. We term this phenomenon Order-to-Space Bias (OTS) and show that it arises in both text-to-image and image-to-image generation, often overriding grounded cues and causing incorrect layouts or swapped assignments. To quantify OTS, we introduce OTS-Bench, which isolates order effects with paired prompts that differ only in entity order and evaluates models along two dimensions: homogenization and correctness. Experiments show that OTS is widespread in modern image generation models and provide evidence that it is primarily data-driven and manifests during the early stages of layout formation. Motivated by this insight, we show that both targeted fine-tuning and early-stage intervention strategies can substantially reduce OTS while preserving generation quality.
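
The paired-prompt idea described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: the names `PromptPair`, `make_pair`, and `first_mentioned_left_rate` are hypothetical, the prompt template is invented, and the rate computed below is only a simple stand-in for an order-sensitivity measure, not the homogenization or correctness metric defined by OTS-Bench. The "left entity" labels are assumed to come from some external detector that is not shown.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PromptPair:
    """Two prompts that differ only in the order the entities are mentioned."""
    forward: str  # first entity mentioned first
    swapped: str  # second entity mentioned first


def make_pair(entity_a: str, entity_b: str,
              template: str = "a {x} and a {y}") -> PromptPair:
    # Hypothetical template; swapping x and y is the only difference between prompts.
    return PromptPair(
        forward=template.format(x=entity_a, y=entity_b),
        swapped=template.format(x=entity_b, y=entity_a),
    )


def first_mentioned_left_rate(left_entities: Sequence[str],
                              first_mentioned: Sequence[str]) -> float:
    """Fraction of images whose left-most entity is the first-mentioned one.

    A value near 0.5 would suggest no order effect under this toy measure;
    a value near 1.0 would suggest mention order is driving the layout.
    """
    if len(left_entities) != len(first_mentioned):
        raise ValueError("inputs must have the same length")
    if not left_entities:
        raise ValueError("inputs must be non-empty")
    hits = sum(left == first for left, first in zip(left_entities, first_mentioned))
    return hits / len(left_entities)


pair = make_pair("cat", "dog")
print(pair.forward)  # a cat and a dog
print(pair.swapped)  # a dog and a cat
```

Under these assumptions, one would generate images for both prompts in each pair, label which entity appears on the left, and pass those labels to `first_mentioned_left_rate` alongside the entity mentioned first in each prompt.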
