CV, LG | Feb 14, 2023

Universal Guidance for Diffusion Models

Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein

arXiv:2302.07121v144.5483 citations | h-index: 72 | Has Code

Originality: Incremental advance

AI Analysis

The proposed method enables more flexible and efficient use of diffusion models across applications, though it appears incremental because it builds on existing guidance methods.

The authors address the limitation that diffusion models accept only the conditioning modalities they were trained on, most commonly text. They propose a universal guidance algorithm that allows control via arbitrary modalities without retraining, and they demonstrate image generation guided by segmentation, face recognition, object detection, and classifier signals.

Typical diffusion models are trained to accept a particular form of conditioning, most commonly text, and cannot be conditioned on other modalities without retraining. In this work, we propose a universal guidance algorithm that enables diffusion models to be controlled by arbitrary guidance modalities without the need to retrain any use-specific components. We show that our algorithm successfully generates quality images with guidance functions including segmentation, face recognition, object detection, and classifier signals. Code is available at https://github.com/arpitbansal297/Universal-Guided-Diffusion.
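
As a rough illustration of the idea in the abstract, the sketch below shows a single guided noise-prediction step. It assumes the common classifier-guidance-style formulation, in which the gradient of a guidance loss, evaluated on the predicted clean sample, is added to the predicted noise. Everything here is a hypothetical stand-in chosen for illustration (`toy_denoiser`, `guidance_loss`, the finite-difference gradient, the scale `s`, and the value of `alpha_bar`); it is not the code from the authors' repository.

```python
import math


def toy_denoiser(z_t, t):
    """Hypothetical stand-in for a trained noise predictor eps_theta(z_t, t)."""
    return [0.1 * v for v in z_t]


def predict_clean(z_t, eps, alpha_bar):
    """Estimate the clean sample z0 from a noisy sample and its predicted noise."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [(z - b * e) / a for z, e in zip(z_t, eps)]


def guidance_loss(z0_hat, target):
    """Hypothetical guidance function: squared distance to a target vector.

    In practice this would be any off-the-shelf model (segmenter, detector,
    classifier) followed by a loss; a squared distance keeps the toy simple.
    """
    return sum((x - y) ** 2 for x, y in zip(z0_hat, target))


def loss_at(z_t, t, alpha_bar, target):
    """Guidance loss as a function of the noisy sample z_t."""
    eps = toy_denoiser(z_t, t)
    return guidance_loss(predict_clean(z_t, eps, alpha_bar), target)


def numerical_grad(z_t, t, alpha_bar, target, h=1e-5):
    """Central finite differences; a real system would use autodiff instead."""
    grad = []
    for i in range(len(z_t)):
        up = list(z_t)
        dn = list(z_t)
        up[i] += h
        dn[i] -= h
        diff = loss_at(up, t, alpha_bar, target) - loss_at(dn, t, alpha_bar, target)
        grad.append(diff / (2.0 * h))
    return grad


def guided_eps(z_t, t, alpha_bar, target, s):
    """Shift the predicted noise along the gradient of the guidance loss."""
    eps = toy_denoiser(z_t, t)
    grad = numerical_grad(z_t, t, alpha_bar, target)
    return [e + s * g for e, g in zip(eps, grad)]
```

Calling `guided_eps(z_t, t, alpha_bar, target, s)` in place of `toy_denoiser(z_t, t)` inside a sampling loop would steer the predicted clean sample toward `target`. The denoiser itself is never retrained, which is the property the abstract emphasizes.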
