LG, CV, CO, ML | May 11

Couple to Control: Joint Initial Noise Design in Diffusion Models

arXiv:2605.11311
AI Analysis

For practitioners using diffusion models, coupled initial noise offers a way to improve diversity in generated batches at no extra sampling cost, with applications in image generation and editing.

The paper introduces a framework for designing coupled initial noises in diffusion models: each noise remains marginally standard Gaussian, but the noises are dependent across samples. Repulsive Gaussian coupling improves gallery diversity on SD1.5, SDXL, and SD3 while largely preserving prompt alignment and image quality, and it matches or outperforms test-time noise-optimization baselines on several diversity metrics at the same sampling cost as independent generation.

Diffusion models typically generate image batches from independent Gaussian initial noises. We argue that this independence assumption is only one choice within a broader class of valid joint noise designs. Instead, one can specify a coupling of the initial noises: each noise remains marginally standard Gaussian, so the pretrained diffusion model receives the same single-sample input distribution, while the dependence across samples is chosen by design. This reframes initial-noise control from selecting or optimizing individual seeds to designing the dependence structure of a multi-sample gallery. This view gives a general framework for initial-noise design, covering several existing methods as special cases and leading naturally to new coupled-noise constructions. Coupled noise can improve generation on its own without adding sampling cost, and it is flexible enough to serve as a structured initialization for optimization-based pipelines when additional computation is available. Empirically, repulsive Gaussian coupling improves gallery diversity on SD1.5, SDXL, and SD3 while largely preserving prompt alignment and image quality. It matches or outperforms recent test-time noise-optimization baselines on several diversity metrics at the same sampling cost as independent generation. Subspace couplings also support fixed-object background generation, producing diverse, natural backgrounds compared with specialized inpainting baselines, with a tunable trade-off in foreground fidelity.
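
To make the idea of a coupling concrete, here is a minimal NumPy sketch of one possible repulsive construction. It is an illustration under stated assumptions, not the paper's algorithm: the function name `equicorrelated_noise`, the parameter `rho`, and the equicorrelated covariance are all hypothetical choices made for this example. Each of the k noises is still marginally standard Gaussian, but matching coordinates of different samples share a negative correlation `rho`, which must lie in [-1/(k-1), 1] for the covariance to be valid.

```python
import numpy as np

def equicorrelated_noise(k, shape, rho, rng=None):
    """Draw k noise arrays, each marginally N(0, I), whose matching
    coordinates have pairwise correlation rho across samples.

    Hypothetical illustration only; not the construction from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    if k < 2:
        raise ValueError("need at least two samples to couple")
    lower = -1.0 / (k - 1)
    if not (lower - 1e-12 <= rho <= 1.0):
        raise ValueError(f"rho must lie in [{lower:.4f}, 1.0]")

    # Cross-sample covariance: ones on the diagonal, rho off the diagonal.
    cov = (1.0 - rho) * np.eye(k) + rho * np.ones((k, k))

    # Symmetric square root via eigendecomposition. Clipping keeps this
    # stable at the boundary rho = -1/(k-1), where cov is singular.
    eigvals, eigvecs = np.linalg.eigh(cov)
    root = (eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T

    # Independent standard normal draws, mixed along the sample axis only.
    # Each row of root has unit norm, so every marginal stays N(0, I).
    eps = rng.standard_normal((k, *shape))
    return np.tensordot(root, eps, axes=(1, 0))

# Usage: four coupled latents with an SD1.5-like latent shape (assumed).
latents = equicorrelated_noise(4, (4, 64, 64), rho=-0.3,
                               rng=np.random.default_rng(0))
print(latents.shape)
```

Setting `rho = 0` recovers independent sampling, and negative values push the samples apart in noise space. Because the mixing acts only across the sample axis, generation cost is unchanged; the resulting arrays could, for instance, be converted to tensors and supplied as the initial latents of a sampler that accepts them.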

