Latent Structure Emergence in Diffusion Models via Confidence-Based Filtering
This work provides a method for conditional generation in diffusion models, though the contribution is incremental because it builds on existing classifier-based approaches.
The paper asks whether the latent space of diffusion models contains enough structure to predict the classes of generated samples, and finds that filtering initial noise seeds by classifier confidence reveals pronounced class separability.
Diffusion models rely on a high-dimensional latent space of initial noise seeds, yet it remains unclear whether this space contains sufficient structure to predict properties of the generated samples, such as their classes. In this work, we investigate the emergence of latent structure through the lens of confidence scores assigned by a pre-trained classifier to generated samples. We show that while the latent space appears largely unstructured when considering all noise realizations, restricting attention to initial noise seeds that produce high-confidence samples reveals pronounced class separability. By comparing class predictability across noise subsets of varying confidence and examining the class separability of the latent space, we find evidence of class-relevant latent structure that becomes observable only under confidence-based filtering. As a practical implication, we discuss how confidence-based filtering enables conditional generation as an alternative to guidance-based methods.
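To make the filtering idea concrete, the following is a minimal sketch of confidence-based seed selection, not the paper's implementation. The abstract does not specify the sampler, the classifier, or the confidence threshold, so `toy_generate`, `toy_classifier`, the threshold of 0.95, and the 8-dimensional seeds are all hypothetical stand-ins chosen only so that the example runs on its own. Confidence is assumed here to be the maximum softmax probability.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_generate(seed):
    """Hypothetical stand-in for a deterministic diffusion sampler.

    A real sampler would map the initial noise seed to an image; here the
    'sample' is just the seed itself so the example is self-contained.
    """
    return seed


def toy_classifier(sample):
    """Hypothetical stand-in for a pre-trained classifier (two classes).

    The logits depend only on the first coordinate of the sample.
    """
    return softmax([4.0 * sample[0], -4.0 * sample[0]])


def filter_seeds(seeds, generate, classify, threshold):
    """Keep seeds whose generated sample is classified with high confidence.

    Returns a list of (seed, predicted_label, confidence) tuples.
    """
    kept = []
    for z in seeds:
        probs = classify(generate(z))
        conf = max(probs)
        label = probs.index(conf)
        if conf >= threshold:
            kept.append((z, label, conf))
    return kept


def conditional_seeds(seeds, generate, classify, target, threshold):
    """Seeds that yield a high-confidence sample of the target class."""
    kept = filter_seeds(seeds, generate, classify, threshold)
    return [z for z, label, _ in kept if label == target]


rng = random.Random(0)  # fixed seed for reproducibility
seeds = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(1000)]

kept = filter_seeds(seeds, toy_generate, toy_classifier, threshold=0.95)
class0 = conditional_seeds(
    seeds, toy_generate, toy_classifier, target=0, threshold=0.95
)
print(f"{len(kept)} of {len(seeds)} seeds pass the filter; "
      f"{len(class0)} of them map to class 0")
```

In this toy setting the retained seeds are trivially separable by construction, since the stand-in classifier reads a single coordinate. With a real sampler and classifier, the retained `(seed, label)` pairs would be the subset on which class separability in the latent space is examined, and `conditional_seeds` illustrates how the same filter could select seeds for a target class without guidance.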