CVAICLLGApr 13, 2023

Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation

AI2Microsoft
arXiv:2304.06671v313 citationsh-index: 85
Originality Incremental advance
AI Analysis

This work addresses a critical limitation in controllable image generation for AI applications, though it is incremental as it builds on existing layout-guided methods.

The authors tackled the problem of layout-guided image generation models failing on out-of-distribution layouts by proposing LayoutBench, a diagnostic benchmark, and IterInpaint, a new method that outperforms state-of-the-art baselines on all four benchmark splits.

Spatial control is a core capability in controllable image generation. Advancements in layout-guided image generation have shown promising results on in-distribution (ID) datasets with similar spatial configurations. However, it is unclear how these models perform when facing out-of-distribution (OOD) samples with arbitrary, unseen layouts. In this paper, we propose LayoutBench, a diagnostic benchmark for layout-guided image generation that examines four categories of spatial control skills: number, position, size, and shape. We benchmark two recent representative layout-guided image generation methods and observe that the good ID layout control may not generalize well to arbitrary layouts in the wild (e.g., objects at the boundary). Next, we propose IterInpaint, a new baseline that generates foreground and background regions step-by-step via inpainting, demonstrating stronger generalizability than existing models on OOD layouts in LayoutBench. We perform quantitative and qualitative evaluation and fine-grained analysis on the four LayoutBench skills to pinpoint the weaknesses of existing models. We show comprehensive ablation studies on IterInpaint, including training task ratio, crop&paste vs. repaint, and generation order. Lastly, we evaluate the zero-shot performance of different pretrained layout-guided image generation models on LayoutBench-COCO, our new benchmark for OOD layouts with real objects, where our IterInpaint consistently outperforms SOTA baselines in all four splits. Project website: https://layoutbench.github.io

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes