IV | CV | Feb 28, 2025

Towards Practical Real-Time Neural Video Compression

arXiv:2502.20762v2 | 88 citations | h-index: 15 | Has Code | CVPR
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient, real-time video compression in applications such as streaming and communication. Its contribution is incremental, optimizing existing neural codec frameworks.

The paper aims to make neural video codecs practical for real-time use by treating operational costs, such as memory I/O, as the main bottleneck. The resulting codec, DCVC-RT, encodes and decodes 1080p video at an average of 125.2 and 112.8 fps respectively, while saving an average of 21% bitrate compared to H.266/VTM.
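To put the quoted figures in more concrete terms, the short sketch below converts the reported throughput into an average per-frame time and applies the 21% saving to a reference bitrate. The 1000 kbps anchor is a hypothetical value chosen only for illustration, and treating the average saving as a flat reduction is a simplification, not how the paper measures it.

```python
# Figures quoted in the summary above.
ENCODE_FPS = 125.2
DECODE_FPS = 112.8
BITRATE_SAVING = 0.21  # average saving relative to H.266/VTM


def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds at a given throughput."""
    return 1000.0 / fps


def bitrate_after_saving(anchor_kbps: float, saving: float = BITRATE_SAVING) -> float:
    """Bitrate left after applying a fractional saving to an anchor bitrate."""
    return anchor_kbps * (1.0 - saving)


if __name__ == "__main__":
    print(f"encode: {frame_time_ms(ENCODE_FPS):.2f} ms/frame")
    print(f"decode: {frame_time_ms(DECODE_FPS):.2f} ms/frame")
    # Hypothetical anchor bitrate, used only to illustrate the arithmetic.
    print(f"1000 kbps anchor -> {bitrate_after_saving(1000.0):.0f} kbps")
```

Both speeds correspond to roughly 8 to 9 ms per frame on average, which is below the 16.7 ms budget of a 60 fps stream.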

We introduce a practical real-time neural video codec (NVC) designed to deliver high compression ratio, low latency and broad versatility. In practice, the coding speed of NVCs depends on 1) computational costs, and 2) non-computational operational costs, such as memory I/O and the number of function calls. While most efficient NVCs prioritize reducing computational cost, we identify operational cost as the primary bottleneck to achieving higher coding speed. Leveraging this insight, we introduce a set of efficiency-driven design improvements focused on minimizing operational costs. Specifically, we employ implicit temporal modeling to eliminate complex explicit motion modules, and use single low-resolution latent representations rather than progressive downsampling. These innovations significantly accelerate NVC without sacrificing compression quality. Additionally, we implement model integerization for consistent cross-device coding and a module-bank-based rate control scheme to improve practical adaptability. Experiments show our proposed DCVC-RT achieves an impressive average encoding/decoding speed at 125.2/112.8 fps (frames per second) for 1080p video, while saving an average of 21% in bitrate compared to H.266/VTM. The code is available at https://github.com/microsoft/DCVC.
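The abstract motivates model integerization by the need for consistent cross-device coding. The sketch below is not the paper's integerization scheme. It only illustrates the underlying issue, under the assumption that different devices may accumulate values in different orders: floating-point sums can then differ in the last bits, while fixed-point integer sums cannot. The 16-bit scale and the helper names are hypothetical.

```python
from itertools import permutations

SCALE_BITS = 16  # hypothetical fixed-point precision, for illustration only


def to_fixed(x: float) -> int:
    """Quantize a float to a fixed-point integer with SCALE_BITS fractional bits."""
    return round(x * (1 << SCALE_BITS))


def accumulate(values):
    """Sum values strictly left to right, as a simple accumulation loop would."""
    total = values[0]
    for v in values[1:]:
        total = total + v
    return total


values = [0.1, 0.2, 0.3]

# Floating point: the result depends on the accumulation order.
float_results = {accumulate(list(p)) for p in permutations(values)}

# Fixed point: integer addition is associative, so every order agrees.
fixed_values = [to_fixed(v) for v in values]
int_results = {accumulate(list(p)) for p in permutations(fixed_values)}

if __name__ == "__main__":
    print("distinct float sums:", len(float_results))
    print("distinct integer sums:", len(int_results))
```

A codec's entropy decoder must reproduce the encoder's probability estimates exactly, so even a last-bit mismatch of this kind can corrupt decoding. That is the general motivation for integer arithmetic in this setting.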

Code Implementations: 1 repo
