RL-ScanIQA: Reinforcement-Learned Scanpaths for Blind 360° Image Quality Assessment
This work addresses perceptual quality prediction for panoramic images in immersive environments, a capability that is crucial for applications such as virtual reality. The contribution is incremental in that it builds on existing scanpath-based approaches.
The paper tackles blind 360° image quality assessment by proposing RL-ScanIQA, a reinforcement learning framework that jointly optimizes scanpath generation and quality assessment, achieving superior in-dataset performance and cross-dataset generalization on three benchmarks.
Blind 360° image quality assessment (IQA) aims to predict perceptual quality for panoramic images without a pristine reference. Unlike conventional planar images, 360° content in immersive environments restricts viewers to a limited viewport at any moment, making viewing behavior critical to quality perception. Although existing scanpath-based approaches attempt to model viewing behavior by approximating the human view-then-rate paradigm, they treat scanpath generation and quality assessment as separate steps, which prevents end-to-end optimization and task-aligned exploration. To address this limitation, we propose RL-ScanIQA, a reinforcement-learned framework for blind 360° IQA. RL-ScanIQA jointly optimizes a PPO-trained scanpath policy and a quality assessor, where the policy receives quality-driven feedback to learn task-relevant viewing strategies. To improve training stability and prevent mode collapse, we design multi-level rewards, including scanpath diversity and equator-biased priors. We further boost cross-dataset robustness using distortion-space augmentation together with rank-consistent losses that preserve intra-image and inter-image quality orderings. Extensive experiments on three benchmarks show that RL-ScanIQA achieves superior in-dataset performance and cross-dataset generalization. Code is available at https://github.com/wangyuji1/RLScanIQA.git.
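To make the idea of a rank-consistent loss concrete, the following is a minimal sketch of one common form, a pairwise hinge penalty on predicted scores. The abstract does not specify the loss actually used in RL-ScanIQA, so the function name `pairwise_rank_hinge_loss`, the `margin` parameter, and the hinge formulation itself are illustrative assumptions rather than the authors' implementation.

```python
from itertools import combinations


def pairwise_rank_hinge_loss(pred, mos, margin=0.0):
    """Hypothetical rank-consistency penalty (not the paper's exact loss).

    For every pair of samples with different ground-truth scores (`mos`),
    penalize the predictions (`pred`) when their ordering disagrees with
    the ground-truth ordering, or agrees by less than `margin`.
    Returns the mean penalty over all non-tied pairs.
    """
    if len(pred) != len(mos):
        raise ValueError("pred and mos must have the same length")

    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(mos)), 2):
        if mos[i] == mos[j]:
            continue  # tied ground truth carries no ordering information
        # sign is +1 when sample i should score higher than sample j
        sign = 1.0 if mos[i] > mos[j] else -1.0
        total += max(0.0, margin - sign * (pred[i] - pred[j]))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Predictions ordered consistently with the ground truth incur no penalty.
consistent = pairwise_rank_hinge_loss([1.0, 2.0, 3.0], [10, 20, 30])

# Fully reversed predictions are penalized on every pair.
reversed_order = pairwise_rank_hinge_loss([3.0, 2.0, 1.0], [10, 20, 30])
```

Under this assumed formulation, the same function could be applied to a group of differently distorted versions of one image (an intra-image ordering) or to a batch of different images (an inter-image ordering). In a gradient-based training loop the plain Python arithmetic would be replaced by differentiable tensor operations.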