OARS: Process-Aware Online Alignment for Generative Real-World Image Super-Resolution
This work targets the perception-fidelity trade-off in image super-resolution for applications that require high-quality visual output. It is a strong, task-specific gain rather than a broad paradigm shift.
The paper proposes OARS, a process-aware online alignment framework that aligns generative real-world image super-resolution (Real-ISR) models with human visual preferences. It reports state-of-the-art performance on Real-ISR benchmarks, with consistent perceptual improvements while maintaining fidelity.
Aligning generative real-world image super-resolution models with human visual preference is challenging due to the perception-fidelity trade-off and diverse, unknown degradations. Prior approaches rely on offline preference optimization and static metric aggregation, which are often non-interpretable and prone to pseudo-diversity under strong conditioning. We propose OARS, a process-aware online alignment framework built on COMPASS, an MLLM-based reward that evaluates the LR-to-SR transition by jointly modeling fidelity preservation and perceptual gain with an input-quality-adaptive trade-off. To train COMPASS, we curate COMPASS-20K, which spans synthetic and real degradations, and introduce a three-stage perceptual annotation pipeline that yields calibrated, fine-grained training labels. Guided by COMPASS, OARS performs progressive online alignment, moving from cold-start flow matching to full-reference RL and finally to reference-free RL, using shallow LoRA optimization for on-policy exploration. Extensive experiments and user studies demonstrate consistent perceptual improvements while maintaining fidelity, achieving state-of-the-art performance on Real-ISR benchmarks.
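To make the idea of an input-quality-adaptive trade-off concrete, the following is a minimal illustrative sketch in Python. It is not the COMPASS reward, which is a learned MLLM-based model whose formulation the abstract does not specify. The function name `adaptive_reward`, the assumption that all three scores lie in [0, 1], and the linear weighting rule (higher input quality places more weight on fidelity) are all hypothetical choices made only for illustration.

```python
def adaptive_reward(fidelity: float, perceptual_gain: float, input_quality: float) -> float:
    """Toy scalar reward blending fidelity and perceptual gain.

    Hypothetical illustration only: the weighting rule below is an
    assumption, not the formulation used by COMPASS.
    All inputs are assumed to be normalized scores in [0, 1].
    """
    for name, value in (
        ("fidelity", fidelity),
        ("perceptual_gain", perceptual_gain),
        ("input_quality", input_quality),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    # Assumed rule: a cleaner LR input leaves less to repair, so the
    # reward leans toward preserving content; a heavily degraded input
    # shifts weight toward perceptual gain.
    w_fidelity = input_quality
    return w_fidelity * fidelity + (1.0 - w_fidelity) * perceptual_gain


if __name__ == "__main__":
    # Same SR scores, evaluated against a clean versus a degraded input.
    print(adaptive_reward(fidelity=0.9, perceptual_gain=0.2, input_quality=0.8))
    print(adaptive_reward(fidelity=0.9, perceptual_gain=0.2, input_quality=0.2))
```

Under this assumed rule, an output that preserves content but adds little perceptual gain scores higher when the input was already clean than when it was heavily degraded. This is the kind of behavior a static, fixed-weight metric aggregation cannot express.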