Video compression dataset and benchmark of learning-based video-quality metrics
This provides a standardized evaluation framework for video compression metrics, addressing a gap in using outdated data, which is incremental but important for video processing research and industry.
The paper tackles the problem of evaluating video-quality metrics by introducing a new benchmark based on a dataset of 2,500 streams encoded with modern standards, showing that new no-reference metrics achieve high correlation with subjective quality and approach top full-reference metrics.
Video-quality measurement is a critical task in video processing. Nowadays, many implementations of new encoding standards - such as AV1, VVC, and LCEVC - use deep-learning-based decoding algorithms with perceptual metrics that serve as optimization objectives. But investigations of the performance of modern video- and image-quality metrics commonly employ videos compressed using older standards, such as AVC. In this paper, we present a new benchmark for video-quality metrics that evaluates video compression. It is based on a new dataset consisting of about 2,500 streams encoded using different standards, including AVC, HEVC, AV1, VP9, and VVC. Subjective scores were collected using crowdsourced pairwise comparisons. The list of evaluated metrics includes recent ones based on machine learning and neural networks. The results demonstrate that new no-reference metrics exhibit a high correlation with subjective quality and approach the capability of top full-reference metrics.