General and Efficient Steering of Unconditional Diffusion
This work addresses inefficient conditional generation in diffusion models, offering AI researchers and practitioners a more computationally efficient alternative to existing methods.
The paper tackles the computational overhead of guiding unconditional diffusion models by introducing a method that avoids gradient guidance during inference, enabling fast controllable generation. Experiments on CIFAR-10, ImageNet, and CelebA show improved accuracy/quality over gradient-based guidance with significant inference speedups.
Guiding unconditional diffusion models typically requires either retraining with conditional inputs or per-step gradient computations (e.g., classifier-based guidance), both of which incur substantial computational overhead. We present a general recipe for efficiently steering unconditional diffusion without gradient guidance during inference, enabling fast controllable generation. Our approach is built on two observations about diffusion model structure. (i) Noise alignment: even in early, highly corrupted stages, coarse semantic steering is possible using a lightweight, offline-computed guidance signal, avoiding any per-step or per-sample gradients. (ii) Transferable concept vectors: once learned, a concept direction in activation space transfers across both timesteps and samples; the same fixed steering vector, learned at a low noise level, remains effective when injected at intermediate noise levels in every generation trajectory, providing refined conditional control efficiently. Such concept directions can be identified efficiently and reliably via the Recursive Feature Machine (RFM), a lightweight, backpropagation-free feature learning method. Experiments on CIFAR-10, ImageNet, and CelebA demonstrate improved accuracy/quality over gradient-based guidance, while achieving significant inference speedups.
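To make the idea of a transferable concept vector concrete, the following is a minimal sketch of gradient-free activation steering, not the paper's implementation. Everything in it is an assumption made for illustration: the function names `concept_direction` and `steered_sampling`, the toy `encode`/`decode` maps standing in for a real denoiser, the steering window, and the use of a difference of mean activations in place of RFM (which is not reproduced here).

```python
import numpy as np

def concept_direction(acts_pos, acts_neg):
    """Unit-norm difference of mean activations.

    A simple stand-in for a learned concept direction; this is NOT RFM.
    """
    d = acts_pos.mean(axis=0) - acts_neg.mean(axis=0)
    return d / np.linalg.norm(d)

def steered_sampling(x, timesteps, encode, decode, direction, strength, window):
    """Run a toy reverse process, adding one fixed vector to the hidden
    activations whenever the timestep lies inside `window` (inclusive).
    No gradients are computed at any step."""
    lo, hi = window
    for t in timesteps:
        h = encode(x, t)
        if lo <= t <= hi:
            # Same vector for every sample and timestep (broadcast over the batch).
            h = h + strength * direction
        x = decode(h, t)
    return x

# Toy setup; all values are hypothetical.
rng = np.random.default_rng(0)
e0 = np.eye(8)[0]                        # the "true" concept axis in this toy
acts_neg = rng.normal(size=(64, 8))      # activations without the concept
acts_pos = acts_neg + 2.0 * e0           # activations with the concept
direction = concept_direction(acts_pos, acts_neg)

encode = lambda x, t: x                  # placeholder for the network up to a hidden layer
decode = lambda h, t: 0.9 * h            # placeholder for the rest of the denoising step
timesteps = range(9, -1, -1)             # high noise (9) down to low noise (0)
x_init = np.zeros((4, 8))                # batch of 4 trajectories

steered = steered_sampling(x_init, timesteps, encode, decode, direction, 1.0, (3, 6))
plain = steered_sampling(x_init, timesteps, encode, decode, direction, 0.0, (3, 6))
```

In this toy, the direction is computed once offline and reused unchanged for every trajectory and every timestep in the window, which is the property the abstract describes. In a real model, the injection would presumably happen at a chosen internal layer of the denoiser rather than through the placeholder `encode`/`decode` split used here.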