SR-NeRV: Improving Embedding Efficiency of Neural Video Representation via Super-Resolution
This work addresses a practical bottleneck in neural video compression, though the contribution is incremental because it builds on existing INR methods.
The paper tackles the loss of high-frequency detail in neural video compression under model-size constraints by integrating a super-resolution network into an INR-based framework, which improves reconstruction quality while keeping the model size comparable.
Implicit Neural Representations (INRs) have garnered significant attention for their ability to model complex signals in various domains. Recently, INR-based frameworks have shown promise in neural video compression by embedding video content into compact neural networks. However, these methods often struggle to reconstruct high-frequency details under the stringent model-size constraints that practical compression scenarios impose. To address this limitation, we propose an INR-based video representation framework that integrates a general-purpose super-resolution (SR) network. This design is motivated by the observation that high-frequency components tend to exhibit low temporal redundancy across frames. By offloading the reconstruction of fine details to a dedicated SR network pre-trained on natural images, the proposed method improves visual fidelity. Experimental results demonstrate that the proposed method outperforms conventional INR-based baselines in reconstruction quality while maintaining a comparable model size.
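The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the abstract does not specify layer types, sizes, or the SR model, so every name below (TinyINR, shared_sr_standin, decode_frame) is hypothetical. The sketch assumes a small per-video network that maps a frame index to a low-resolution frame, followed by a fixed, video-independent upscaler. Nearest-neighbour upsampling stands in for the pre-trained SR network, and the sketch further assumes that only the per-video weights count toward the model-size budget, which the abstract suggests but does not state.

```python
import math
import random


def positional_encoding(t, num_freqs=4):
    """Encode a normalized frame index t in [0, 1] as sin/cos features."""
    feats = []
    for k in range(num_freqs):
        feats.append(math.sin((2 ** k) * math.pi * t))
        feats.append(math.cos((2 ** k) * math.pi * t))
    return feats


class TinyINR:
    """Hypothetical per-video network: frame index -> low-resolution frame."""

    def __init__(self, height, width, num_freqs=4, hidden=16, seed=0):
        rng = random.Random(seed)
        self.height, self.width, self.num_freqs = height, width, num_freqs
        in_dim = 2 * num_freqs
        out_dim = height * width
        # Two bias-free linear layers; in a real system these weights would be
        # fitted to one video and stored as its compressed representation.
        self.w1 = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(hidden)]
        self.w2 = [[rng.gauss(0, 0.1) for _ in range(hidden)] for _ in range(out_dim)]

    def num_params(self):
        """Number of per-video weights (the assumed model-size budget)."""
        return sum(len(r) for r in self.w1) + sum(len(r) for r in self.w2)

    def __call__(self, t):
        x = positional_encoding(t, self.num_freqs)
        # Hidden layer with ReLU, then a linear output layer.
        h = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.w1]
        flat = [sum(w * v for w, v in zip(row, h)) for row in self.w2]
        # Reshape the flat output into height x width rows.
        return [flat[r * self.width:(r + 1) * self.width] for r in range(self.height)]


def shared_sr_standin(frame, scale=2):
    """Stand-in for the pre-trained SR network (assumption: nearest-neighbour).

    It has no per-video parameters, mirroring the idea that fine-detail
    reconstruction is offloaded to a network shared across all videos.
    """
    out = []
    for row in frame:
        up_row = [v for v in row for _ in range(scale)]
        out.extend([list(up_row) for _ in range(scale)])
    return out


def decode_frame(inr, t, scale=2):
    """Decode frame t: per-video INR at low resolution, then shared upscaling."""
    return shared_sr_standin(inr(t), scale)
```

Calling decode_frame(TinyINR(4, 6), 0.5) yields an 8 x 12 frame, while the per-video parameter count depends only on the low-resolution output size. Under the stated assumptions, this is the intended trade-off: the embedded network spends its capacity on coarse, temporally redundant content, and the shared upscaler is responsible for the remaining detail.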