CV · AI · Feb 5

Bidirectional Reward-Guided Diffusion for Real-World Image Super-Resolution

arXiv:2602.07069v1 · 2 citations · h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses real-world image super-resolution for applications such as photography and vision systems. It is an incremental advance, building on existing diffusion and reward feedback methods.

The authors tackle the failure of diffusion-based super-resolution on real-world low-resolution images, which stems from distribution shifts. They propose Bird-SR, a bidirectional reward-guided diffusion framework that is reported to outperform state-of-the-art methods in perceptual quality while preserving structural consistency on real-world benchmarks.

Diffusion-based super-resolution can synthesize rich details, but models trained on synthetic paired data often fail on real-world LR images due to distribution shifts. We propose Bird-SR, a bidirectional reward-guided diffusion framework that formulates super-resolution as trajectory-level preference optimization via reward feedback learning (ReFL), jointly leveraging synthetic LR-HR pairs and real-world LR images. Because structural fidelity is easily degraded by ReFL, the model is directly optimized on synthetic pairs at early diffusion steps; this also helps preserve structure for real-world inputs, since the distribution gap is smaller at the structural level. For perceptual enhancement, quality-guided rewards are applied at later sampling steps to both synthetic and real LR images. To mitigate reward hacking, the rewards for synthetic results are formulated in a relative advantage space bounded by their clean counterparts, while real-world optimization is regularized via a semantic alignment constraint. Furthermore, to balance structural and perceptual learning, we adopt a dynamic fidelity-perception weighting strategy that emphasizes structure preservation at early stages and progressively shifts focus toward perceptual optimization at later diffusion steps. Extensive experiments on real-world SR benchmarks demonstrate that Bird-SR consistently outperforms state-of-the-art methods in perceptual quality while preserving structural consistency, validating its effectiveness for real-world super-resolution.
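
The interplay between the dynamic fidelity-perception weighting and the relative-advantage reward can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives neither the schedule nor the exact reward form, so the linear schedule, the zero cap on the advantage, the timestep convention (sampling runs from t = num_steps - 1 down to t = 0), and all function names below are assumptions made purely for illustration.

```python
def fidelity_weight(t: int, num_steps: int) -> float:
    """Hypothetical linear schedule (an assumption, not from the paper).

    Returns 1.0 at the earliest sampling step (t = num_steps - 1, most noise)
    and 0.0 at the last one (t = 0), so structure dominates early and
    perception dominates late.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    if not 0 <= t < num_steps:
        raise ValueError("t must lie in [0, num_steps)")
    return t / (num_steps - 1)


def relative_advantage(reward_sr: float, reward_hr: float) -> float:
    """Assumed reward form for synthetic pairs.

    The reward of the super-resolved output is measured relative to the reward
    of its clean HR counterpart and capped at zero, so the model gains nothing
    by scoring above its reference (one simple guard against reward hacking).
    """
    return min(reward_sr - reward_hr, 0.0)


def combined_loss(fidelity_loss: float, reward_sr: float, reward_hr: float,
                  t: int, num_steps: int) -> float:
    """Blend a fidelity loss and a reward-based perceptual loss at timestep t."""
    w = fidelity_weight(t, num_steps)
    perceptual_loss = -relative_advantage(reward_sr, reward_hr)
    return w * fidelity_loss + (1.0 - w) * perceptual_loss
```

Under these assumptions, `combined_loss` reduces to the pure fidelity term at the first sampling step and to the pure reward term at the last one. A real system would operate on tensors and a learned reward model rather than on scalar floats.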

