CV | Feb 16, 2023

LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation

arXiv:2302.08908v1 | 92 citations | h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating high-quality, layout-aligned images for applications in design and visualization, representing an incremental advancement in adapting existing models.

The paper tackles layout-to-image generation by adapting foundational diffusion models with a neural adaptor using layout attention and task-aware prompts, achieving significant performance improvements over 10 other generative models across three datasets.

Layout-to-image generation refers to the task of synthesizing photo-realistic images based on semantic layouts. In this paper, we propose LayoutDiffuse, which adapts a foundational diffusion model pretrained on large-scale image or text-image datasets for layout-to-image generation. By adopting a novel neural adaptor based on layout attention and task-aware prompts, our method trains efficiently, generates images with both high perceptual quality and layout alignment, and needs less data. Experiments on three datasets show that our method significantly outperforms 10 other generative models based on GANs, VQ-VAE, and diffusion models.
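The abstract names layout attention but does not say how it is computed. The sketch below shows one plausible reading, in which each image token attends only to tokens that share a bounding box with it and background tokens attend only to each other. It is an assumption, not the paper's formulation. The function names (`region_ids`, `layout_mask`, `masked_self_attention`), the box format `(x0, y0, x1, y1)` with exclusive upper bounds, and the omission of learned query/key/value projections are all hypothetical simplifications.

```python
import math


def region_ids(height, width, boxes):
    """For each token (row-major order), return the set of box indices covering it."""
    ids = []
    for y in range(height):
        for x in range(width):
            covering = {
                k for k, (x0, y0, x1, y1) in enumerate(boxes)
                if x0 <= x < x1 and y0 <= y < y1
            }
            ids.append(covering)
    return ids


def layout_mask(height, width, boxes):
    """Boolean N x N mask: token i may attend to token j if they share a box,
    or if both lie in the background (covered by no box)."""
    ids = region_ids(height, width, boxes)
    n = len(ids)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if ids[i] and ids[j]:
                mask[i][j] = bool(ids[i] & ids[j])
            else:
                mask[i][j] = (not ids[i]) and (not ids[j])
    return mask


def masked_self_attention(features, mask):
    """Scaled dot-product self-attention restricted by `mask`.
    Assumption: queries, keys and values are the raw features (no learned weights)."""
    dim = len(features[0])
    out = []
    for i, query in enumerate(features):
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            if mask[i][j] else float("-inf")
            for j, key in enumerate(features)
        ]
        top = max(scores)  # finite, because every token may attend to itself
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * value[c] for w, value in zip(weights, features)) / total
            for c in range(dim)
        ])
    return out


if __name__ == "__main__":
    # 2 x 2 token grid with one box covering the left column.
    mask = layout_mask(2, 2, [(0, 0, 1, 2)])
    feats = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0], [0.0, 3.0]]
    for row in masked_self_attention(feats, mask):
        print(row)
```

In this toy setup, tokens inside the box mix only with each other, so features from the background never leak into the object region and vice versa. A real adaptor would presumably add learned projections and class-conditioned embeddings on top of such a mask.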

