HCAI | May 14, 2025

An Exploration of Default Images in Text-to-Image Generation

arXiv:2505.09166v4 | 1 citation | h-index: 2
Originality: Synthesis-oriented
AI Analysis

This paper addresses a practical issue for users and developers of text-to-image models, though its contribution is incremental because it builds on existing TTI systems.

The paper tackles the problem of default images in text-to-image generation, where a model produces closely similar outputs for unrelated prompts. A computational analysis of about 750,000 images finds consistent default images across unrelated prompts, and a user study examines how such images affect user satisfaction.

In the creative practice of text-to-image (TTI) generation, images are synthesized from textual prompts. By design, TTI models always yield an output, even if the prompt contains unknown terms. In this case, the model may generate default images: images that closely resemble each other across many unrelated prompts. Studying default images is valuable for designing better solutions for prompt engineering and TTI generation. We present the first investigation into default images on Midjourney. We describe an initial study in which we manually created input prompts triggering default images, and several ablation studies. Building on these, we conduct a computational analysis of about 750,000 images, revealing consistent default images across unrelated prompts. We also conduct an online user study investigating how default images may affect user satisfaction. Our work lays the foundation for understanding default images in TTI generation, highlighting their practical relevance as well as challenges and future research directions.
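To make the idea of a default image concrete, the sketch below shows one way such images could be flagged computationally. It is not the authors' method: it assumes each generated image has already been reduced to an embedding vector by some image encoder, and the function names, thresholds, and toy prompts are all hypothetical. Images are grouped greedily by cosine similarity, and a group is reported as a default-image candidate only when it spans several distinct prompts.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def find_default_candidates(items, sim_threshold=0.95, min_prompts=3):
    """Group near-identical images and keep groups spanning many prompts.

    items: list of (prompt, embedding) pairs. The embeddings are assumed
    to come from an image encoder chosen by the reader.
    Returns a list of sorted prompt lists, one per candidate group.
    """
    groups = []  # each group: representative embedding + member indices
    for idx, (_, emb) in enumerate(items):
        for group in groups:
            # Greedy assignment: join the first group that is similar enough.
            if cosine(group["rep"], emb) >= sim_threshold:
                group["members"].append(idx)
                break
        else:
            groups.append({"rep": emb, "members": [idx]})

    candidates = []
    for group in groups:
        prompts = {items[i][0] for i in group["members"]}
        # Similar images from ONE prompt are expected; similar images
        # from many unrelated prompts suggest a default image.
        if len(prompts) >= min_prompts:
            candidates.append(sorted(prompts))
    return candidates


# Toy data: three made-up nonsense prompts whose images nearly coincide,
# plus one ordinary prompt generated twice.
toy = [
    ("xqzt florp", [1.0, 0.0, 0.01]),
    ("blarn wibble", [0.99, 0.02, 0.0]),
    ("zzyx quon", [1.0, 0.01, 0.0]),
    ("a red bicycle", [0.0, 1.0, 0.0]),
    ("a red bicycle", [0.01, 0.99, 0.0]),
]
candidates = find_default_candidates(toy)
print(candidates)
```

On the toy data, only the group covering the three nonsense prompts is reported; the two bicycle images are similar to each other but come from a single prompt, so they are not treated as a default image. The similarity threshold and the minimum number of prompts are illustrative values that would need tuning on real data.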

