A Combined Deep Learning based End-to-End Video Coding Architecture for YUV Color Space
The paper tackles the problem of deep learning-based video coding for the YUV 4:2:0 format, introducing a new architecture that outperforms HEVC in intra-frame coding but is less efficient in inter-frame coding.
The work addresses the need for a fair comparison against video coding standards, though it is incremental, since it adapts existing methods to a specific color space.
Most existing deep learning based end-to-end video coding (DLEC) architectures are designed specifically for the RGB color format, yet the video coding standards developed over the past few decades, including H.264/AVC, H.265/HEVC and H.266/VVC, have been designed primarily for the YUV 4:2:0 format, where the chrominance (U and V) components are subsampled to achieve better compression performance by exploiting the characteristics of the human visual system. While many papers on DLEC compare these two distinct coding schemes in the RGB domain, a common evaluation framework in the YUV 4:2:0 domain would allow a fairer comparison. This paper introduces a new DLEC architecture for video coding that effectively supports YUV 4:2:0 and compares its performance against the HEVC standard under a common evaluation framework. The experimental results on YUV 4:2:0 video sequences show that the proposed architecture can outperform HEVC in intra-frame coding; however, inter-frame coding is not as efficient, contrary to the RGB coding results reported in recent papers.
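To make the 4:2:0 format concrete, the following is a minimal sketch of how chroma subsampling changes the amount of data a codec has to handle. It is not the architecture proposed in the paper: the function names are hypothetical, and the 2x2 averaging filter is only one simple assumed choice of downsampling.

```python
def yuv420_plane_shapes(width, height):
    """Return (rows, cols) of the Y, U and V planes of a 4:2:0 frame."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even frame dimensions")
    luma = (height, width)
    # In 4:2:0, each chroma plane is halved both horizontally and vertically.
    chroma = (height // 2, width // 2)
    return {"Y": luma, "U": chroma, "V": chroma}


def sample_count(shapes):
    """Total number of samples across all planes."""
    return sum(rows * cols for rows, cols in shapes.values())


def subsample_2x2(plane):
    """Downsample a plane (list of rows) by averaging non-overlapping 2x2 blocks.

    Assumption: plain averaging; real pipelines may use other filters.
    """
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, len(plane[0]), 2)
        ]
        for r in range(0, len(plane), 2)
    ]


shapes = yuv420_plane_shapes(1920, 1080)
full_resolution = 3 * 1920 * 1080  # three full-size planes, as in RGB or 4:4:4
ratio = sample_count(shapes) / full_resolution
```

For a 1920x1080 frame, each chroma plane is 960x540, so the frame carries half as many samples as a three-plane full-resolution representation. A network designed for three equally sized RGB planes therefore cannot take 4:2:0 input unchanged, which is the mismatch the abstract describes.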