CV · AI · Apr 9

RewardFlow: Generate Images by Optimizing What You Reward

Onkar Susladkar, Dong-Hwan Jang, Tushar Prakash, Adheesh Juvekar, Vedant Shah, Ayush Barik, Nabeel Bashir, Muntasir Wahed, Ritish Shrirao, Ismini Lourentzou

arXiv:2604.08536 · 87.0

Predicted impact: top 19% in CV · last 90 days
Originality: Highly original

AI Analysis

RewardFlow targets image editing and compositional generation, and is assessed as a novel method rather than an incremental improvement.

The authors steer pretrained diffusion and flow-matching models at inference time with RewardFlow, a framework that uses multi-reward Langevin dynamics to combine complementary differentiable rewards for semantic alignment, perceptual fidelity, and other objectives. They report state-of-the-art edit fidelity and compositional alignment on several image editing and compositional generation benchmarks.

We introduce RewardFlow, an inversion-free framework that steers pretrained diffusion and flow-matching models at inference time through multi-reward Langevin dynamics. RewardFlow unifies complementary differentiable rewards for semantic alignment, perceptual fidelity, localized grounding, object consistency, and human preference, and further introduces a differentiable VQA-based reward that provides fine-grained semantic supervision through language-vision reasoning. To coordinate these heterogeneous objectives, we design a prompt-aware adaptive policy that extracts semantic primitives from the instruction, infers edit intent, and dynamically modulates reward weights and step sizes throughout sampling. Across several image editing and compositional generation benchmarks, RewardFlow delivers state-of-the-art edit fidelity and compositional alignment.
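To make the phrase "multi-reward Langevin dynamics" concrete, here is a minimal sketch of a generic Langevin ascent step on a weighted sum of rewards. It is not the authors' implementation. The quadratic toy rewards, the fixed weights, and the names `langevin_step`, `quadratic_reward_grad`, and `run` are all assumptions made for illustration; the abstract describes learned rewards (including a VQA-based one) and an adaptive policy for weights and step sizes, none of which are modeled here.

```python
import math
import random


def langevin_step(x, reward_grads, weights, step_size, temperature, rng):
    """One generic Langevin step on a weighted sum of rewards (illustrative only).

    Update: x <- x + step_size * sum_i w_i * grad r_i(x)
                 + sqrt(2 * step_size * temperature) * gaussian_noise
    """
    grads = [grad(x) for grad in reward_grads]
    noise_scale = math.sqrt(2.0 * step_size * temperature)
    new_x = []
    for d in range(len(x)):
        # Combine the per-reward gradients into a single ascent direction.
        drift = sum(w * g[d] for w, g in zip(weights, grads))
        new_x.append(x[d] + step_size * drift + noise_scale * rng.gauss(0.0, 1.0))
    return new_x


def quadratic_reward_grad(target):
    """Gradient of the hypothetical toy reward r(x) = -0.5 * ||x - target||^2."""
    return lambda x: [t - xi for t, xi in zip(target, x)]


def run(num_steps=200, step_size=0.05, temperature=0.0, seed=0):
    """Run the toy sampler with two competing rewards and fixed weights."""
    rng = random.Random(seed)
    # Two toy rewards pulling toward different targets (assumed, not from the paper).
    reward_grads = [
        quadratic_reward_grad([1.0, 0.0]),
        quadratic_reward_grad([0.0, 1.0]),
    ]
    weights = [0.75, 0.25]  # fixed here; the paper adapts weights during sampling
    x = [0.0, 0.0]
    for _ in range(num_steps):
        x = langevin_step(x, reward_grads, weights, step_size, temperature, rng)
    return x


if __name__ == "__main__":
    print(run(temperature=0.0))  # noise-free: settles near the weighted target
    print(run(temperature=0.1))  # noisy: fluctuates around the same point
```

With `temperature=0.0` the noise term vanishes and the iterate settles near the weight-averaged target, which shows how the reward weights trade objectives off against each other. A prompt-aware policy like the one the abstract describes would presumably replace the fixed `weights` and `step_size` with values that change from step to step.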
