IV | CV | LG | Jun 15, 2019

Speeding up VP9 Intra Encoder with Hierarchical Deep Learning Based Partition Prediction

arXiv:1906.06476v2 | 15 citations
AI Analysis

This work addresses encoding speed for video compression. Its improvement is specific to VP9 intra-mode encoding, where it outperforms the fastest recommended speed level of the reference VP9 encoder in both speedup and BD-rate.

The paper tackles the computational cost of VP9 video encoding by proposing a deep learning framework that predicts superblock partitions, achieving an average 69.7% speedup in intra-mode encoding at the cost of a 1.71% increase in BD-rate.

In the VP9 video codec, block sizes are decided during encoding by recursively partitioning 64$\times$64 superblocks using rate-distortion optimization (RDO). This process is computationally intensive because of the combinatorial search space of possible partitions of a superblock. Here, we propose an alternative deep-learning-based framework that predicts the intra-mode superblock partitions in the form of a four-level partition tree, using a hierarchical fully convolutional network (H-FCN). We created a large database of VP9 superblocks and their corresponding partitions to train an H-FCN model, which was subsequently integrated with the VP9 encoder to reduce the intra-mode encoding time. The experimental results establish that our approach speeds up intra-mode encoding by 69.7% on average, at the expense of a 1.71% increase in the Bjøntegaard-Delta bitrate (BD-rate). While VP9 provides several built-in speed levels that are designed to provide faster encoding at the expense of decreased rate-distortion performance, we find that our model outperforms the fastest recommended speed level of the reference VP9 encoder for the good quality intra encoding configuration, in terms of both speedup and BD-rate.
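To make the "combinatorial search space" and the "four-level partition tree" concrete, here is a minimal Python sketch. It is not the authors' code. It assumes the four VP9 partition types (none, horizontal, vertical, split) are available at each block size from 64 down to 8, and it assumes a hypothetical layout for the predicted tree: one grid of labels per level, of shape 1×1, 2×2, 4×4 and 8×8. The names `count_partitions` and `leaf_blocks` are illustrative only.

```python
# Partition labels, assumed to mirror VP9's four partition types.
NONE, HORZ, VERT, SPLIT = range(4)
SUPERBLOCK = 64        # superblock edge length in pixels
MIN_SPLIT_BLOCK = 8    # smallest block that can still be partitioned


def count_partitions(size=SUPERBLOCK):
    """Count distinct partition trees of a size x size block."""
    if size < MIN_SPLIT_BLOCK:
        return 1  # a 4x4 block cannot be partitioned further
    # NONE, HORZ and VERT end the recursion (3 choices); SPLIT yields
    # four sub-blocks that are each partitioned independently.
    return 3 + count_partitions(size // 2) ** 4


def leaf_blocks(levels, level=0, row=0, col=0):
    """Expand per-level labels into leaf blocks (x, y, width, height).

    levels[k] is a 2**k x 2**k grid of labels for blocks of edge
    length 64 >> k (hypothetical layout, assumed for this sketch).
    """
    size = SUPERBLOCK >> level
    half = size // 2
    x, y = col * size, row * size
    label = levels[level][row][col]
    if label == NONE:
        return [(x, y, size, size)]
    if label == HORZ:
        return [(x, y, size, half), (x, y + half, size, half)]
    if label == VERT:
        return [(x, y, half, size), (x + half, y, half, size)]
    # SPLIT: four square quadrants.
    quads = [(dr, dc) for dr in (0, 1) for dc in (0, 1)]
    if size == MIN_SPLIT_BLOCK:
        # Splitting an 8x8 block gives four 4x4 leaves.
        return [(x + dc * half, y + dr * half, half, half) for dr, dc in quads]
    blocks = []
    for dr, dc in quads:
        blocks += leaf_blocks(levels, level + 1, 2 * row + dr, 2 * col + dc)
    return blocks


def uniform(n, label=NONE):
    """Build an n x n grid filled with one label."""
    return [[label] * n for _ in range(n)]


# Example tree: split the superblock, split its top-left 32x32 block,
# and cut one of the resulting 16x16 blocks horizontally.
levels = [uniform(1, SPLIT), uniform(2), uniform(4), uniform(8)]
levels[1][0][0] = SPLIT
levels[2][0][1] = HORZ
blocks = leaf_blocks(levels)

print(count_partitions())  # number of candidate trees per superblock
print(blocks)
```

Under these assumptions a 16×16 block already admits 259 partition trees, and the count for a full 64×64 superblock grows as the fourth power at each level. That growth is why an exhaustive RDO search is expensive and why predicting the tree directly can save encoding time.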

Code Implementations: 1 repo