CV · IV · Mar 26, 2024

Low-Latency Neural Stereo Streaming

arXiv:2403.17879v1 · 7 citations · h-index: 9 · CVPR
Originality: Highly original
AI Analysis

This work addresses the need for efficient, low-latency multi-view video compression in applications such as virtual reality and autonomous driving. It proposes a novel method for a known bottleneck: existing stereo codecs compress the two views one after the other.

The paper tackles inefficient stereo video compression by proposing a parallel coding method that processes the left and right views simultaneously, reducing latency and improving rate-distortion performance compared to existing neural and conventional codecs.

The rise of new video modalities such as virtual reality and autonomous driving has increased the demand for efficient multi-view video compression methods, both in terms of rate-distortion (R-D) performance and in terms of delay and runtime. While most recent stereo video compression approaches have shown promising performance, they compress the left and right views sequentially, leading to poor parallelization and runtime performance. This work presents Low-Latency neural codec for Stereo video Streaming (LLSS), a novel parallel stereo video coding method designed for fast and efficient low-latency stereo video streaming. Instead of using sequential cross-view motion compensation like existing methods, LLSS introduces a bidirectional feature shifting module to directly exploit mutual information among views and encode them effectively with a joint cross-view prior model for entropy coding. Thanks to this design, LLSS processes the left and right views in parallel, minimizing latency, while substantially improving R-D performance compared to both existing neural and conventional codecs.
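The abstract does not say how the bidirectional feature shifting module is implemented. As a rough illustration only, the sketch below assumes one plausible reading: each view's feature map is shifted horizontally by a small set of candidate disparities and stacked onto the other view's features, so that both views receive cross-view context in a single symmetric step and neither has to wait for the other to be decoded. The function names, the single-channel list-of-rows feature map, and the disparity set are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Row = List[float]
FeatureMap = List[Row]  # H x W, one channel, kept tiny for illustration


def shift_horizontal(feat: FeatureMap, d: int) -> FeatureMap:
    """Shift each row by d pixels (d > 0 moves content right), zero-padding the gap."""
    width = len(feat[0])
    pad = min(abs(d), width)
    shifted = []
    for row in feat:
        if d >= 0:
            shifted.append([0.0] * pad + row[: width - pad])
        else:
            shifted.append(row[pad:] + [0.0] * pad)
    return shifted


def bidirectional_shift(
    left: FeatureMap, right: FeatureMap, disparities: List[int]
) -> Tuple[List[FeatureMap], List[FeatureMap]]:
    """Hypothetical sketch: give each view shifted copies of the other view.

    Assumes rectified stereo, where a point at column x in the left view
    appears at column x - d in the right view for some disparity d >= 0.
    """
    # Right features moved right by d line up with the left view.
    left_stack = [left] + [shift_horizontal(right, d) for d in disparities]
    # Left features moved left by d line up with the right view.
    right_stack = [right] + [shift_horizontal(left, -d) for d in disparities]
    return left_stack, right_stack


# Toy example: one feature at column 2 in the left view, column 1 in the right.
left_feat = [[0.0, 0.0, 1.0, 0.0]]
right_feat = [[0.0, 1.0, 0.0, 0.0]]
left_in, right_in = bidirectional_shift(left_feat, right_feat, [0, 1, 2])
```

In this toy case the shifted copy for disparity 1 matches the target view exactly. In a learned codec, later layers would have to work out which shifted channel is informative at each position. The two stacks do not depend on each other, so they can be computed in parallel, which is the property the abstract emphasizes.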

