Mar 10, 2017

A Convolutional Neural Network Approach for Half-Pel Interpolation in Video Coding

arXiv:1703.03502v1
49 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of improving video compression efficiency for coding standards, though it is incremental as it builds on existing CNN applications in vision tasks.

The paper tackled the inefficiency of fixed interpolation filters in video coding by proposing a CNN-based interpolation filter, achieving up to 3.2% and on average 0.9% BD-rate reduction compared to HEVC's standard filter.

Motion compensation is a fundamental technology in video coding to remove the temporal redundancy between video frames. To further improve the coding efficiency, sub-pel motion compensation has been utilized, which requires interpolation of fractional samples. Video coding standards usually adopt fixed interpolation filters that are derived from signal processing theory. However, as video signals are not stationary, the fixed interpolation filters may turn out to be less efficient. Inspired by the great success of convolutional neural networks (CNNs) in computer vision, we propose to design a CNN-based interpolation filter (CNNIF) for video coding. Unlike in previous studies, one difficulty in training CNNIF is the lack of ground truth, since the fractional samples are actually not available. Our solution to this problem is to derive the "ground truth" of fractional samples by smoothing high-resolution images, which the conducted experiments verify to be effective. Compared to the fixed half-pel interpolation filter for luma in High Efficiency Video Coding (HEVC), our proposed CNNIF achieves up to 3.2% and on average 0.9% BD-rate reduction under the low-delay P configuration.
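To make the abstract's two technical ideas concrete, here is a minimal 1-D sketch in plain Python. `half_pel_row` applies the HEVC 8-tap luma half-pel taps (-1, 4, -11, 40, 40, -11, 4, -1)/64 and rounds straight to 8 bits, which simplifies the standard's intermediate-precision handling. `make_training_pair` illustrates the "smooth a high-resolution image, then sample" idea for deriving half-pel targets. The function names, the [1, 2, 1]/4 smoothing kernel, the edge replication, and the even/odd sampling layout are assumptions chosen for illustration, not details taken from the paper.

```python
# HEVC 8-tap luma half-pel filter taps; they sum to 64.
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def clip8(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel_row(samples):
    """Fixed-filter half-pel sample between samples[i] and samples[i + 1].

    Simplification: rounds directly to 8 bits and replicates edge samples.
    """
    n = len(samples)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            # Taps cover integer positions i - 3 .. i + 4.
            j = min(max(i - 3 + k, 0), n - 1)
            acc += tap * samples[j]
        out.append(clip8((acc + 32) >> 6))  # divide by 64 with rounding
    return out


def smooth_row(row):
    """Hypothetical [1, 2, 1] / 4 low-pass filter (assumed kernel)."""
    n = len(row)
    return [
        (row[max(i - 1, 0)] + 2 * row[i] + row[min(i + 1, n - 1)] + 2) >> 2
        for i in range(n)
    ]


def make_training_pair(high_res_row):
    """Derive (integer samples, half-pel targets) from one high-res row.

    Assumed layout: even positions of the smoothed row play the role of
    integer samples, odd positions the role of half-pel "ground truth".
    """
    smoothed = smooth_row(high_res_row)
    return smoothed[0::2], smoothed[1::2]


ramp = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
fixed = half_pel_row(ramp)
inputs, targets = make_training_pair([0, 10, 20, 30, 40, 50, 60, 70])
```

In this sketch, a learned filter would be trained to map `inputs` to `targets`, while `half_pel_row` stands in for the fixed baseline that it is compared against.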

