CV | AI | CL | LG | RO | Nov 27, 2023

Reinforcement Learning from Diffusion Feedback: Q* for Image Search

arXiv:2311.15648v1 | h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient personalization in image generation without fine-tuning or data augmentation, though it appears incremental as it builds on existing diffusion and reinforcement learning techniques.

The paper tackles the problem of generating high-quality, diverse images from a single input image without text, using reinforcement learning from diffusion feedback (RLDF) and a noisy diffusion gradient method, achieving class-consistent results across domains like retail, sports, and agriculture.

Large vision-language models are steadily gaining personalization capabilities at the cost of fine-tuning or data augmentation. We present two models for image generation using model-agnostic learning that align semantic priors with generative capabilities. RLDF, or Reinforcement Learning from Diffusion Feedback, is a singular approach for visual imitation through prior-preserving reward function guidance. This employs Q-learning (with standard Q*) for generation and follows a semantic-rewarded trajectory for image search through finite encoding-tailored actions. The second proposed method, noisy diffusion gradient, is optimization driven. At the root of both methods is a special CFG encoding that we propose for continual semantic guidance. Using only a single input image and no text input, RLDF generates high-quality images over varied domains including retail, sports, and agriculture, showcasing class-consistency and strong visual diversity. The project website is available at https://infernolia.github.io/RLDF.
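To make the Q-learning part of the abstract concrete, here is a minimal sketch of generic tabular Q-learning searching over a finite set of encoding edits. It is not the paper's implementation: the 5 x 5 integer grid, the `TARGET` code, the distance-based reward, and all hyperparameters are assumptions chosen for illustration. In RLDF the reward comes from diffusion feedback and the actions are tailored to the proposed CFG encoding, neither of which is modelled here.

```python
import random

GRID = 5                                       # hypothetical 5 x 5 space of integer codes
TARGET = (3, 1)                                # stand-in for the code matching the input image
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # finite set of encoding edits (assumed)


def step(state, action):
    """Apply one edit, clipped to the grid; the reward is a toy 'semantic' score."""
    nxt = tuple(min(GRID - 1, max(0, s + a)) for s, a in zip(state, action))
    reward = -(abs(nxt[0] - TARGET[0]) + abs(nxt[1] - TARGET[1]))
    return nxt, reward, nxt == TARGET


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Standard tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(x, y): [0.0] * len(ACTIONS) for x in range(GRID) for y in range(GRID)}
    for _ in range(episodes):
        state = (rng.randrange(GRID), rng.randrange(GRID))
        for _ in range(50):                    # cap on episode length
            if state == TARGET:
                break
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward, done = step(state, ACTIONS[a])
            # Bellman backup: bootstrap from the best next action unless terminal.
            backup = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (backup - q[state][a])
            state = nxt
    return q


def greedy_search(q, start, max_steps=20):
    """Follow the learned greedy policy and return the visited codes."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == TARGET:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path


if __name__ == "__main__":
    table = train()
    print(greedy_search(table, (0, 4)))
```

The greedy rollout plays the role of the "semantic-rewarded trajectory": once the table approximates Q*, each step picks the edit with the highest estimated return until the search reaches the best-scoring code.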
