The Latent Color Subspace: Emergent Order in High-Dimensional Chaos
The paper offers a method for finer color control in image generation models, but the contribution is incremental, since it builds on existing latent-space interpretation work.
The paper addresses limited fine-grained control in text-to-image generation by interpreting how color is represented in the VAE latent space of FLUX.1 [Dev]. It finds a structure that reflects Hue, Saturation, and Lightness, and shows that this Latent Color Subspace can both predict and control color with a training-free method.
Text-to-image generation models have advanced rapidly, yet achieving fine-grained control over generated images remains difficult, largely due to limited understanding of how semantic information is encoded. We develop an interpretation of the color representation in the Variational Autoencoder latent space of FLUX.1 [Dev], revealing a structure reflecting Hue, Saturation, and Lightness. We verify our Latent Color Subspace (LCS) interpretation by demonstrating that it can both predict and explicitly control color, introducing a fully training-free method in FLUX based solely on closed-form latent-space manipulation. Code is available at https://github.com/ExplainableML/LCS.
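To make the idea of "closed-form latent-space manipulation" concrete, here is a minimal toy sketch in Python. It is not the authors' implementation (see the linked repository for that). It assumes, purely for illustration, that color occupies a 3-dimensional subspace with a known orthonormal basis and that HSL is laid out cylindrically (hue as angle, saturation as radius, lightness as height). The function names, the 5-dimensional toy latent, and the choice of basis are all hypothetical.

```python
import colorsys
import math


def hsl_to_cylinder(h, s, l):
    """Map HSL (each in [0, 1]) to 3-D coordinates.

    Assumed layout: hue is an angle, saturation a radius, lightness a height.
    """
    angle = 2.0 * math.pi * h
    return (s * math.cos(angle), s * math.sin(angle), l)


def cylinder_to_hsl(x, y, z):
    """Inverse of hsl_to_cylinder."""
    h = (math.atan2(y, x) / (2.0 * math.pi)) % 1.0
    s = math.hypot(x, y)
    return (h, s, z)


def project(latent, basis):
    """Coordinates of `latent` along each vector of an orthonormal `basis`."""
    return [sum(a * b for a, b in zip(latent, v)) for v in basis]


def read_color(latent, basis):
    """Predict HSL from the latent's coordinates in the assumed color subspace."""
    return cylinder_to_hsl(*project(latent, basis))


def set_color(latent, basis, target_hsl):
    """Move the latent to `target_hsl` inside the subspace, in closed form.

    Only the component inside span(basis) changes; the orthogonal
    complement (everything that is not color, under our assumption)
    is left untouched. No training or optimization is involved.
    """
    current = project(latent, basis)
    target = hsl_to_cylinder(*target_hsl)
    edited = list(latent)
    for c, t, v in zip(current, target, basis):
        for i, v_i in enumerate(v):
            edited[i] += (t - c) * v_i
    return edited


# Toy 5-D latent; the first three axes play the role of the color subspace.
toy_basis = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]
toy_latent = [0.3, -0.2, 0.9, 1.5, -0.7]

edited_latent = set_color(toy_latent, toy_basis, (0.25, 0.5, 0.4))
h, s, l = read_color(edited_latent, toy_basis)
# colorsys takes its arguments in HLS order, not HSL.
rgb = colorsys.hls_to_rgb(h, l, s)
```

In this sketch, prediction (`read_color`) and control (`set_color`) are two directions of the same projection, which is one way to read the abstract's claim that the interpretation can "both predict and explicitly control color". How the actual subspace is found, and what its real geometry is, is not stated in the abstract and is left open here.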