LG · Feb 10

ELROND: Exploring and decomposing intrinsic capabilities of diffusion models

Paweł Skierś, Tomasz Trzciński, Kamil Deja

arXiv:2602.10216v1 · 2 citations

Originality: Incremental advance

AI Analysis

This work addresses users' lack of fine-grained control over diffusion model outputs, offering incremental improvements in interpretability and diversity.

The authors tackle uncontrolled semantic variation in diffusion model outputs generated from a single text prompt. They propose a framework that disentangles semantic directions in the input embedding space, yielding interpretable steering directions, mitigation of mode collapse in distilled models, and a novel estimator of concept complexity.

A single text prompt passed to a diffusion model often yields a wide range of visual outputs determined solely by a stochastic process, leaving users with no direct control over which specific semantic variations appear in the image. While existing unsupervised methods attempt to analyze these variations via output features, they omit the underlying generative process. In this work, we propose a framework to disentangle these semantic directions directly within the input embedding space. To that end, we collect a set of gradients, obtained by backpropagating the differences between stochastic realizations of a fixed prompt, and then decompose them into meaningful steering directions using either Principal Component Analysis (PCA) or a Sparse Autoencoder (SAE). Our approach yields three key contributions: (1) it isolates interpretable, steerable directions for precise, fine-grained control over a single concept; (2) it effectively mitigates mode collapse in distilled models by reintroducing lost diversity; and (3) it establishes a novel estimator for concept complexity under a specific model, based on the dimensionality of the discovered subspace.
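
The pipeline described in the abstract (collect gradients with respect to the prompt embedding, decompose them, read off the subspace dimensionality) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The toy `generate` function stands in for a diffusion sampler, the squared-difference loss is an assumed form of "the differences between stochastic realizations", and the 99% explained-variance cutoff is an arbitrary choice. All names (`P`, `generate`, `grad_wrt_embedding`, `n_dirs`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n_pairs = 16, 3, 200      # embedding dim, toy hidden rank, number of noise pairs

P = rng.normal(size=(k, d))     # toy "model": only k directions of the embedding matter
e = rng.normal(size=d)          # fixed prompt embedding (assumed given)

def generate(e, z):
    """Toy stand-in for a sampler: output depends on embedding e and noise z."""
    return np.tanh(P @ e + z)

def grad_wrt_embedding(e, z_a, z_b):
    """Analytic gradient of 0.5 * ||generate(e, z_a) - generate(e, z_b)||^2
    with respect to e, treating the second realization as a constant target."""
    out_a = generate(e, z_a)
    target = generate(e, z_b)
    return P.T @ ((out_a - target) * (1.0 - out_a**2))

# Step 1: collect one gradient per pair of stochastic realizations.
grads = np.stack([
    grad_wrt_embedding(e, rng.normal(size=k), rng.normal(size=k))
    for _ in range(n_pairs)
])

# Step 2: PCA via SVD of the centered gradient matrix.
centered = grads - grads.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Step 3: subspace dimensionality as a rough complexity proxy
# (number of components needed to reach 99% explained variance).
n_dirs = int(np.searchsorted(np.cumsum(explained), 0.99) + 1)
directions = vt[:n_dirs]        # unit-norm candidate steering directions

# Steering: nudge the embedding along the first direction (step size is arbitrary).
steered = e + 0.5 * directions[0]
```

Because the toy model depends on the embedding only through the rank-3 matrix `P`, every gradient lies in a 3-dimensional subspace, so `n_dirs` can be at most 3 here. With a real diffusion model the gradients would come from automatic differentiation through the sampler, and a sparse autoencoder could replace the PCA step.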
