CV · IV · Aug 23, 2022

Quality-Constant Per-Shot Encoding by Two-Pass Learning-based Rate Factor Prediction

arXiv:2208.10739v1 · 3 citations · h-index: 71
Originality: Incremental advance
AI Analysis

This work addresses the challenge of keeping the user experience consistent while limiting bit-rate usage in video streaming. It is an incremental improvement over existing encoding techniques.

The paper tackles the problem of keeping video quality constant across segments. It proposes a two-pass, learning-based method that predicts the rate factor (RF) for each segment. In the reported experiments, the actual VMAF falls within ±1 of the target in 98.88% of cases, at 1.55 times encoding complexity on average.
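The accuracy figure quoted above can be read as the fraction of encoded segments whose measured VMAF lands inside a tolerance band around the target. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name `within_tolerance_rate` and the sample scores are hypothetical.

```python
from typing import Sequence


def within_tolerance_rate(
    actual_vmaf: Sequence[float],
    target_vmaf: Sequence[float],
    tol: float = 1.0,
) -> float:
    """Fraction of segments whose actual VMAF is within +/- tol of the target.

    Hypothetical helper: the +/-1 band follows the summary above, but the
    name, signature, and data layout are assumptions made for illustration.
    """
    if len(actual_vmaf) != len(target_vmaf):
        raise ValueError("actual_vmaf and target_vmaf must have the same length")
    if not actual_vmaf:
        raise ValueError("at least one segment is required")
    hits = sum(1 for a, t in zip(actual_vmaf, target_vmaf) if abs(a - t) <= tol)
    return hits / len(actual_vmaf)


# Made-up scores for four segments that all target VMAF 91.
rate = within_tolerance_rate([90.5, 92.5, 89.0, 91.0], [91.0, 91.0, 91.0, 91.0])
print(rate)  # two of the four segments are inside the band
```

Under this reading, a reported 98.88% would correspond to `within_tolerance_rate(...)` returning 0.9888 over the test set.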

Providing quality-constant streams can simultaneously guarantee user experience and avoid wasting bit-rate. In this paper, we propose a novel deep-learning-based two-pass encoder parameter prediction framework that decides the rate factor (RF) with which the encoder can output streams of constant quality. For each one-shot segment in a video, the proposed method first extracts spatial, temporal, and pre-coding features with an ultra-fast pre-process. Based on these features, a deep neural network predicts an RF parameter. The video encoder uses this RF to compress the segment in the first encoding pass, and the VMAF quality of the first-pass encode is then measured. If the quality does not meet the target, a second-pass RF prediction and encode are performed. With the first-pass predicted RF and the corresponding actual quality as feedback, the second-pass prediction is highly accurate. Experiments show that the proposed method requires only 1.55 times encoding complexity on average, while the accuracy, defined as the compressed video's actual VMAF being within $\pm1$ of the target VMAF, reaches 98.88%.
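The control flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every function name is hypothetical, the ±1 acceptance band for the first pass is assumed from the reported accuracy criterion, and the toy predictors and the linear RF-to-VMAF model stand in for the deep neural networks, the real encoder, and the VMAF measurement.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class EncodeResult:
    rf: float      # rate factor used for the final encode
    vmaf: float    # measured quality of the final encode
    passes: int    # 1 if the first pass met the target, otherwise 2


def encode_shot_two_pass(
    shot: Any,
    target_vmaf: float,
    extract_features: Callable[[Any], Any],
    predict_first: Callable[[Any, float], float],
    predict_second: Callable[[Any, float, float, float], float],
    encode_and_measure: Callable[[Any, float], float],
    tol: float = 1.0,  # assumption: "meets target" means within +/- tol VMAF
) -> EncodeResult:
    """Two-pass RF selection for one shot; all callables are hypothetical."""
    features = extract_features(shot)            # spatial, temporal, pre-coding
    rf1 = predict_first(features, target_vmaf)   # first-pass RF prediction
    vmaf1 = encode_and_measure(shot, rf1)        # first encode + VMAF
    if abs(vmaf1 - target_vmaf) <= tol:
        return EncodeResult(rf1, vmaf1, 1)
    # Second pass: the first-pass RF and its measured quality act as feedback.
    rf2 = predict_second(features, target_vmaf, rf1, vmaf1)
    vmaf2 = encode_and_measure(shot, rf2)
    return EncodeResult(rf2, vmaf2, 2)


# --- Toy stand-ins, for illustration only (not from the paper) ---
TOY_SLOPE = 0.5  # assumed VMAF points lost per unit of RF


def toy_extract_features(shot: Any) -> dict:
    return {"name": shot}


def toy_predict_first(features: dict, target_vmaf: float) -> float:
    return 23.0  # a fixed guess in place of a learned first-pass model


def toy_predict_second(features: dict, target_vmaf: float,
                       rf1: float, vmaf1: float) -> float:
    # Correct the first guess using the observed quality gap.
    return rf1 - (target_vmaf - vmaf1) / TOY_SLOPE


def toy_encode_and_measure(shot: Any, rf: float) -> float:
    return 100.0 - TOY_SLOPE * rf  # toy linear quality model


result = encode_shot_two_pass(
    "shot_001", 91.0,
    toy_extract_features, toy_predict_first,
    toy_predict_second, toy_encode_and_measure,
)
print(result)
```

With the toy model the first pass misses the target and the feedback-corrected second pass lands on it. A real system would replace the toy callables with the feature extractor, the two predictors, and an actual encoder plus VMAF measurement.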

