CV | AI | Oct 25, 2025

Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation

arXiv:2510.22107v11 citations | h-index: 38
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of producing meaningful diversity in conditional image generation, with applications such as natural and medical imaging. It is an incremental advance that integrates GFlowNets into existing generative models.

The paper tackles the problem of generating diverse images from uncertain conditions by proposing Rainbow, a framework that uses GFlowNets to decompose a condition into diverse latent representations. The authors report improved diversity and fidelity across image synthesis tasks.

Capturing diversity is crucial in conditional and prompt-based image generation, particularly when conditions contain uncertainty that can lead to multiple plausible outputs. To generate diverse images reflecting this diversity, traditional methods often modify random seeds, making it difficult to discern meaningful differences between samples, or diversify the input prompt, which is limited in verbally interpretable diversity. We propose Rainbow, a novel conditional image generation framework, applicable to any pretrained conditional generative model, that addresses inherent condition/prompt uncertainty and generates diverse plausible images. Rainbow is based on a simple yet effective idea: decomposing the input condition into diverse latent representations, each capturing an aspect of the uncertainty and generating a distinct image. First, we integrate a latent graph, parameterized by Generative Flow Networks (GFlowNets), into the prompt representation computation. Second, leveraging GFlowNets' advanced graph sampling capabilities to capture uncertainty and output diverse trajectories over the graph, we produce multiple trajectories that collectively represent the input condition, leading to diverse condition representations and corresponding output images. Evaluations on natural image and medical image datasets demonstrate Rainbow's improvement in both diversity and fidelity across image synthesis, image generation, and counterfactual generation tasks.
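The two steps in the abstract (sample several trajectories over a latent graph, then turn each into its own condition representation) can be illustrated with a toy sketch. This is not the authors' implementation. Every name below (`sample_trajectory`, `condition_from_graph`, `edge_logits`, `node_feats`) is hypothetical, the forward policy uses fixed random logits rather than a trained GFlowNet, and the mean-aggregation step is only an assumed stand-in for however Rainbow folds the graph into the prompt representation.

```python
import math
import random


def sample_trajectory(num_nodes, edge_logits, max_edges, rng):
    """Sample one trajectory: add directed edges one at a time until a
    'stop' action is drawn or max_edges is reached.

    Assumption: edge_logits stands in for a learned forward policy. A real
    GFlowNet would train this policy so that terminal graphs are sampled
    with probability proportional to a reward; here it is untrained.
    """
    candidates = [(i, j) for i in range(num_nodes)
                  for j in range(num_nodes) if i != j]
    edges = []
    while len(edges) < max_edges:
        remaining = [e for e in candidates if e not in edges]
        # Softmax over the remaining edges plus a stop action (logit 0).
        logits = [edge_logits[e] for e in remaining] + [0.0]
        top = max(logits)
        weights = [math.exp(l - top) for l in logits]
        choice = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if choice == len(remaining):  # stop action drawn
            break
        edges.append(remaining[choice])
    return edges


def condition_from_graph(node_feats, edges):
    """One round of mean aggregation over incoming edges, then mean-pool
    all nodes into a single condition vector (an assumed, simplified
    read-out, not the paper's)."""
    n, d = len(node_feats), len(node_feats[0])
    updated = []
    for j in range(n):
        sources = [node_feats[i] for (i, t) in edges if t == j]
        msgs = sources + [node_feats[j]]  # include the node itself
        updated.append([sum(v[k] for v in msgs) / len(msgs) for k in range(d)])
    return [sum(u[k] for u in updated) / n for k in range(d)]


rng = random.Random(0)
num_nodes, dim = 4, 3
# Hypothetical node features, e.g. pieces of a prompt embedding.
node_feats = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_nodes)]
edge_logits = {(i, j): rng.gauss(0, 1)
               for i in range(num_nodes) for j in range(num_nodes) if i != j}

# Several trajectories give several graphs, hence several condition vectors.
graphs = [sample_trajectory(num_nodes, edge_logits, 3, rng) for _ in range(5)]
conditions = [condition_from_graph(node_feats, g) for g in graphs]
```

In this sketch each sampled graph yields one condition vector, and each vector would be passed to the pretrained conditional generator to produce one image. The point is only that the diversity comes from structured samples over a graph rather than from changing the random seed.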

