CV | AI | Mar 4, 2025

UAR-NVC: A Unified AutoRegressive Framework for Memory-Efficient Neural Video Compression

arXiv:2503.02733v2 | 5 citations | h-index: 5 | IEEE Transactions on Circuits and Systems for Video Technology (Print)
Originality: Incremental advance
AI Analysis

This work addresses memory efficiency for video compression in resource-constrained scenarios, representing an incremental advancement by integrating autoregressive modeling with INRs.

The paper tackles the high memory consumption of Implicit Neural Representations (INRs) in video compression by proposing a unified autoregressive framework that processes videos in clips, reducing memory usage while improving performance over baselines.

Implicit Neural Representations (INRs) have demonstrated significant potential in video compression by representing videos as neural networks. However, as the number of frames increases, the memory consumption for training and inference increases substantially, posing challenges in resource-constrained scenarios. Inspired by the success of traditional video compression frameworks, which process video frame by frame and can efficiently compress long videos, we adopt this modeling strategy for INRs to decrease memory consumption, while aiming to unify the frameworks from the perspective of timeline-based autoregressive modeling. In this work, we present a novel understanding of INR models from an autoregressive (AR) perspective and introduce a Unified AutoRegressive Framework for memory-efficient Neural Video Compression (UAR-NVC). UAR-NVC integrates timeline-based and INR-based neural video compression under a unified autoregressive paradigm. It partitions videos into several clips and processes each clip using a different INR model instance, leveraging the advantages of both compression frameworks while allowing seamless adaptation to either in form. To further reduce temporal redundancy between clips, we design two modules to optimize the initialization, training, and compression of these model parameters. UAR-NVC supports adjustable latencies by varying the clip length. Extensive experimental results demonstrate that UAR-NVC, with its flexible video clip setting, can adapt to resource-constrained environments and significantly improve performance compared to different baseline models. The project page: "https://wj-inf.github.io/UAR-NVC-page/".
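The clip-wise processing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `partition_into_clips` and `encode_video`, the dictionary standing in for an INR model, and the `init_from` field are all hypothetical, and no training or compression is performed. It only shows how a video's frame indices might be split into clips, with each clip assigned its own model instance that is initialized from the previous clip's model.

```python
from typing import List, Optional


def partition_into_clips(num_frames: int, clip_length: int) -> List[range]:
    """Split frame indices 0..num_frames-1 into consecutive clips."""
    if num_frames < 0 or clip_length <= 0:
        raise ValueError("num_frames must be >= 0 and clip_length must be > 0")
    return [
        range(start, min(start + clip_length, num_frames))
        for start in range(0, num_frames, clip_length)
    ]


def encode_video(num_frames: int, clip_length: int) -> List[dict]:
    """Walk the clips in order, creating one placeholder model per clip.

    Assumption: each model is warm-started from the previous clip's model,
    standing in for the autoregressive dependency between clips.
    """
    models: List[dict] = []
    previous: Optional[dict] = None
    for clip_id, frames in enumerate(partition_into_clips(num_frames, clip_length)):
        # Placeholder for fitting an INR to this clip; a real system would train here.
        model = {
            "clip_id": clip_id,
            "frames": list(frames),
            "init_from": None if previous is None else previous["clip_id"],
        }
        models.append(model)
        previous = model
    return models


if __name__ == "__main__":
    for m in encode_video(num_frames=10, clip_length=4):
        print(m)
```

In this sketch only one clip's frames are handled at a time, so a shorter `clip_length` means fewer frames per model and more model instances overall, which mirrors the adjustable clip length the abstract mentions.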

