IV · CV · MM · Feb 12

Learning Perceptual Representations for Gaming NR-VQA with Multi-Task FR Signals

arXiv:2602.11903v2 · h-index: 8
AI Analysis

This paper addresses video quality assessment for gaming content, which poses distinct challenges such as fast motion and stylized graphics. The contribution appears incremental, since it builds on existing multi-task learning approaches.

The paper tackles the challenge of no-reference video quality assessment for gaming videos by proposing MTL-VQA, a multi-task learning framework that uses full-reference metrics as supervisory signals to learn perceptually meaningful features without human labels. Experiments show it achieves performance competitive with state-of-the-art NR-VQA methods across various settings.

No-reference video quality assessment (NR-VQA) for gaming videos is challenging due to limited human-rated datasets and unique content characteristics including fast motion, stylized graphics, and compression artifacts. We present MTL-VQA, a multi-task learning framework that uses full-reference metrics as supervisory signals to learn perceptually meaningful features without human labels for pretraining. By jointly optimizing multiple full-reference (FR) objectives with adaptive task weighting, our approach learns shared representations that transfer effectively to NR-VQA. Experiments on gaming video datasets show MTL-VQA achieves performance competitive with state-of-the-art NR-VQA methods across both MOS-supervised and label-efficient/self-supervised settings.
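
The abstract does not say which adaptive task weighting scheme MTL-VQA uses. As a rough illustration of the general idea only, the sketch below assumes a common choice, uncertainty-style weighting with one learnable log-variance per task, and should not be read as the authors' method. The function names, the learning rate, and the three example loss values (standing in for regression errors against three unnamed FR metrics) are all hypothetical.

```python
import math


def weighted_multitask_loss(task_losses, log_vars):
    """Combine per-task losses with learnable log-variances s_i.

    Assumed form (not confirmed by the paper):
        total = sum_i( exp(-s_i) * L_i + s_i )
    A large s_i down-weights task i; the "+ s_i" term keeps s_i from
    growing without bound.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task")
    return sum(math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)) + sum(log_vars)


def log_var_gradients(task_losses, log_vars):
    """Analytic gradient of the total loss with respect to each s_i.

    d/ds_i [exp(-s_i) * L_i + s_i] = 1 - exp(-s_i) * L_i
    """
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]


def update_log_vars(task_losses, log_vars, lr=0.1):
    """One plain gradient-descent step on the task weights only."""
    grads = log_var_gradients(task_losses, log_vars)
    return [s - lr * g for s, g in zip(log_vars, grads)]


if __name__ == "__main__":
    # Hypothetical per-task losses for three FR regression heads.
    losses = [2.0, 0.5, 1.0]
    log_vars = [0.0, 0.0, 0.0]  # start with equal weights
    for _ in range(200):
        log_vars = update_log_vars(losses, log_vars)
    weights = [math.exp(-s) for s in log_vars]
    print("log-variances:", log_vars)
    print("effective task weights:", weights)
    print("combined loss:", weighted_multitask_loss(losses, log_vars))
```

In a real training loop the log-variances would be optimized jointly with the shared backbone by an autodiff framework, and the per-task losses would change every step. Holding them fixed here only makes the weighting behaviour easy to inspect: tasks with larger losses end up with smaller effective weights.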

