CV · Sep 26, 2025

FailureAtlas: Mapping the Failure Landscape of T2I Models via Active Exploration

Muxi Chen, Zhaohua Zhang, Chenchen Zhao, Mingyang Chen, Wenyu Jiang, Tianwen Jiang, Jianhuan Zhuo, Yu Tang, Qiuyong Xiao, Jihong Zhang, Qiang Xu

arXiv:2509.21995v1 · 1 citations · h-index: 8 · Has Code

Originality: Highly original

AI Analysis

FailureAtlas provides a diagnostic-first methodology for improving the robustness of generative AI, addressing a critical need for developers and researchers in the field.

The paper tackles the problem of limited diagnostic power in static benchmarks for Text-to-Image (T2I) models by introducing FailureAtlas, a framework that actively explores and maps systematic failures, uncovering over 247,000 error slices in Stable Diffusion 1.5 and linking them to training data scarcity.

Static benchmarks have provided a valuable foundation for comparing Text-to-Image (T2I) models. However, their passive design offers limited diagnostic power: they struggle to uncover the full landscape of systematic failures or to isolate their root causes. We argue for a complementary paradigm: active exploration. We introduce FailureAtlas, the first framework designed to autonomously explore and map the vast failure landscape of T2I models at scale. FailureAtlas frames error discovery as a structured search for minimal, failure-inducing concepts. Although this search is computationally explosive, we make it tractable with novel acceleration techniques. When applied to Stable Diffusion models, our method uncovers hundreds of thousands of previously unknown error slices (over 247,000 in SD1.5 alone) and provides the first large-scale evidence linking these failures to data scarcity in the training set. By providing a principled and scalable engine for deep model auditing, FailureAtlas establishes a new, diagnostic-first methodology to guide the development of more robust generative AI. The code is available at https://github.com/cure-lab/FailureAtlas

