CV · Mar 10, 2025

AnomalyPainter: Vision-Language-Diffusion Synergy for Zero-Shot Realistic and Diverse Industrial Anomaly Synthesis

arXiv:2503.07253v2 · 6 citations · h-index: 12
Originality: Incremental advance
AI Analysis

This work addresses anomaly synthesis for industrial inspection. Its approach aims to break the diversity-realism trade-off, though the advance is incremental because it builds on existing vision-language and diffusion techniques.

The paper tackles the challenge of synthesizing realistic and diverse anomalies in industrial images. It proposes AnomalyPainter, a zero-shot framework that integrates vision-language models, latent diffusion, and a texture library, and it reports outperforming existing methods in realism, diversity, and generalization.

While existing anomaly synthesis methods have made remarkable progress, achieving both realism and diversity in synthesis remains a major obstacle. To address this, we propose AnomalyPainter, a zero-shot framework that breaks the diversity-realism trade-off dilemma through synergizing Vision Language Large Model (VLLM), Latent Diffusion Model (LDM), and our newly introduced texture library Tex-9K. Tex-9K is a professional texture library containing 75 categories and 8,792 texture assets crafted for diverse anomaly synthesis. Leveraging VLLM's general knowledge, reasonable anomaly text descriptions are generated for each industrial object and matched with relevant diverse textures from Tex-9K. These textures then guide the LDM via ControlNet to paint on normal images. Furthermore, we introduce Texture-Aware Latent Init to stabilize the natural-image-trained ControlNet for industrial images. Extensive experiments show that AnomalyPainter outperforms existing methods in realism, diversity, and generalization, achieving superior downstream performance.

