IV · CV · Mar 6

Enhancing Neural Video Compression of Static Scenes with Positive-Incentive Noise

Cheng Yuan, Zhenyu Jia, Jiawei Shao, Xuelong Li

arXiv:2603.06095v1 · h-index: 3

Predicted impact: top 18% in IV · last 90 days · Originality: Incremental advance

AI Analysis

This work provides a solution for robust video transmission and economical long-term retention of surveillance footage, which matters in applications where authenticity is essential and hallucinated details are unacceptable.

This paper addresses the inefficient encoding of static scene videos by existing compression methods by incorporating positive-incentive noise into neural video compression (NVC). The approach reinterprets short-term temporal changes as noise that facilitates model finetuning, yielding a 73% Bjøntegaard delta (BD) rate saving over general NVC models in preliminary experiments.

Static scene videos, such as surveillance feeds and videotelephony streams, constitute a dominant share of storage consumption and network traffic. However, both traditional standardized codecs and neural video compression (NVC) methods struggle to encode these videos efficiently due to inadequate usage of temporal redundancy and severe distribution gaps between training and test data, respectively. While recent generative compression methods improve perceptual quality, they introduce hallucinated details that are unacceptable in authenticity-critical applications. To overcome these limitations, we propose to incorporate positive-incentive noise into NVC for static scene videos, where short-term temporal changes are reinterpreted as positive-incentive noise to facilitate model finetuning. By disentangling transient variations from the persistent background, structured prior information is internalized in the compression model. During inference, the invariant component requires minimal signaling, thus reducing data transmission while maintaining pixel-level fidelity. Preliminary experiments demonstrate a 73% Bjøntegaard delta (BD) rate saving compared to general NVC models. Our method provides an effective solution to trade computation for bandwidth, enabling robust video transmission under adverse network conditions and economic long-term retention of surveillance footage.
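For readers unfamiliar with the headline metric, the following is a minimal sketch of how a BD rate is conventionally computed from two rate-distortion curves. It follows the standard Bjøntegaard procedure (cubic fit of log-bitrate against PSNR, averaged over the shared quality range) and is not taken from the paper. The function name `bd_rate` and all numbers in the usage example are hypothetical, chosen only for illustration.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate change (%) of a test codec relative to an anchor at equal PSNR.

    Negative values mean the test codec needs fewer bits for the same quality.
    Each curve needs at least four rate/PSNR points for the cubic fit.
    """
    log_rate_anchor = np.log10(np.asarray(rate_anchor, dtype=float))
    log_rate_test = np.log10(np.asarray(rate_test, dtype=float))
    psnr_anchor = np.asarray(psnr_anchor, dtype=float)
    psnr_test = np.asarray(psnr_test, dtype=float)

    # Fit log10(bitrate) as a cubic polynomial of PSNR for each codec.
    fit_anchor = np.polyfit(psnr_anchor, log_rate_anchor, 3)
    fit_test = np.polyfit(psnr_test, log_rate_test, 3)

    # Compare only over the PSNR interval covered by both curves.
    lo = max(psnr_anchor.min(), psnr_test.min())
    hi = min(psnr_anchor.max(), psnr_test.max())

    # Mean log-rate of each fitted curve over [lo, hi] via the antiderivative.
    int_anchor = np.polyint(fit_anchor)
    int_test = np.polyint(fit_test)
    avg_anchor = (np.polyval(int_anchor, hi) - np.polyval(int_anchor, lo)) / (hi - lo)
    avg_test = (np.polyval(int_test, hi) - np.polyval(int_test, lo)) / (hi - lo)

    # Convert the mean log-rate difference back to a percentage.
    return (10.0 ** (avg_test - avg_anchor) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical rate points (kbps) and PSNR values (dB), not data from the paper.
    anchor_rate = [100.0, 200.0, 400.0, 800.0]
    psnr = [32.0, 35.0, 38.0, 41.0]
    # Synthetic test codec that uses 27% of the anchor bitrate at every quality level.
    test_rate = [r * 0.27 for r in anchor_rate]
    print(f"BD rate: {bd_rate(anchor_rate, psnr, test_rate, psnr):.1f}%")
```

With the synthetic curves above, the test codec uses 27% of the anchor bitrate at every quality point, so the result comes out at approximately -73%, which is what a "73% BD rate saving" means. Real codec comparisons rarely show a constant ratio, which is why the metric averages over the overlapping quality range.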

