IV · CV · LG · Nov 13, 2021

A strong baseline for image and video quality assessment

arXiv:2111.07104v1 · 24 citations · Has Code
Originality: Incremental advance
AI Analysis

This work provides a strong baseline for image and video quality assessment. It is an incremental advance: it simplifies existing methods while maintaining competitive performance.

The authors tackled the problem of perceptual quality assessment for images and videos by proposing a simple unified model that uses only one global feature from a backbone network, achieving comparable performance to complex existing models and surpassing current SOTA baselines on public and private datasets.

In this work, we present a simple yet effective unified model for perceptual quality assessment of images and videos. In contrast to existing models, which usually consist of complex network architectures or rely on the concatenation of multiple branches of features, our model achieves comparable performance by applying only one global feature derived from a backbone network (ResNet-18 in the presented work). Combined with some training tricks, the proposed model surpasses the current baselines of SOTA models on public and private datasets. Based on the proposed architecture, we release well-trained models for three common real-world scenarios: UGC videos in the wild, PGC videos with compression, and game videos with compression. These three pre-trained models can be applied directly for quality assessment or fine-tuned further for more customized usage. All the code, the SDK, and the pre-trained weights of the proposed models are publicly available at https://github.com/Tencent/CenseoQoE.
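The "one global feature" idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes that the backbone's final feature map is collapsed by global average pooling, that a single linear layer maps the pooled vector to a score, and that a video score is the mean of per-frame scores. The function names, toy feature maps, and weights are all hypothetical; in practice the feature maps would come from the ResNet-18 backbone.

```python
from statistics import fmean


def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a length-C vector."""
    return [fmean(v for row in channel for v in row) for channel in feature_map]


def linear_head(feature, weights, bias):
    """Map the single global feature vector to one scalar quality score."""
    return sum(w * x for w, x in zip(weights, feature)) + bias


def frame_score(feature_map, weights, bias):
    """Score one image or frame from its backbone feature map."""
    return linear_head(global_average_pool(feature_map), weights, bias)


def video_score(frame_feature_maps, weights, bias):
    """Assumption: a video's score is the mean of its per-frame scores."""
    return fmean(frame_score(f, weights, bias) for f in frame_feature_maps)


# Toy stand-ins for backbone outputs: C=2 channels, H=W=2 (hypothetical values).
frame_a = [[[1.0, 3.0], [5.0, 7.0]], [[2.0, 2.0], [2.0, 2.0]]]
frame_b = [[[0.0, 0.0], [0.0, 0.0]], [[4.0, 4.0], [4.0, 4.0]]]
weights, bias = [0.5, 0.25], 1.0  # hypothetical head parameters

print(frame_score(frame_a, weights, bias))
print(video_score([frame_a, frame_b], weights, bias))
```

Because the head sees only one pooled vector per frame, the same code path handles both images (one frame) and videos (many frames), which is the sense in which such a model can be called unified.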

Code Implementations: 1 repo