CV · LG · Oct 13, 2022

Scalable Neural Video Representations with Learnable Positional Features

arXiv:2210.06823v1 · 44 citations · h-index: 54
Originality: Incremental advance
AI Analysis

This work addresses the problem of efficient video encoding for applications like compression and editing, though it appears incremental as it builds on existing CNR methods.

The paper aims to make coordinate-based neural representations (CNRs) for videos faster to train, higher in quality, and more parameter-efficient by introducing learnable positional features. It reports 2x faster training and an improvement in encoding quality from 34.07 to 34.57 PSNR while using more than 8 times fewer parameters.
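To put the PSNR figures quoted above in perspective, the following minimal sketch converts the reported values back to mean squared error. It uses the standard definition PSNR = 10·log10(MAX² / MSE). The assumption that pixel values are normalized to [0, 1] is mine, not stated in the source, and the resulting ratio does not depend on it.

```python
import math


def psnr_from_mse(mse: float, max_value: float = 1.0) -> float:
    """PSNR in dB for a given mean squared error and peak signal value."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_from_psnr(psnr_db: float, max_value: float = 1.0) -> float:
    """Invert the PSNR definition to recover the mean squared error."""
    return max_value ** 2 / (10.0 ** (psnr_db / 10.0))


# PSNR values quoted in the summary; max_value=1.0 is an assumption.
baseline_mse = mse_from_psnr(34.07)
nvp_mse = mse_from_psnr(34.57)

# A 0.5 dB gain corresponds to a factor of 10 ** (-0.05) in MSE.
mse_ratio = nvp_mse / baseline_mse
print(f"MSE ratio: {mse_ratio:.3f}")
```

Under this definition, a 0.5 dB gain corresponds to roughly 11% lower mean squared error, independent of the peak value chosen.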

Succinct representation of complex signals using coordinate-based neural representations (CNRs) has seen great progress, and several recent efforts focus on extending them to videos. Here, the main challenge is how to (a) alleviate the compute-inefficiency of training CNRs, (b) achieve high-quality video encoding, and (c) maintain parameter-efficiency. To meet requirements (a), (b), and (c) simultaneously, we propose neural video representations with learnable positional features (NVP), a novel CNR that introduces "learnable positional features" to effectively amortize a video as latent codes. Specifically, we first present a CNR architecture based on 2D latent keyframes that learn the common video contents across each spatio-temporal axis, which dramatically improves all three requirements. Then, we propose to use existing powerful image and video codecs as a compute- and memory-efficient compression procedure for the latent codes. We demonstrate the superiority of NVP on the popular UVG benchmark: compared with prior art, NVP not only trains 2 times faster (in less than 5 minutes) but also improves encoding quality from 34.07 to 34.57 (measured with the PSNR metric), while using more than 8 times fewer parameters. We also show intriguing properties of NVP, e.g., video inpainting and video frame interpolation.
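The idea of 2D latent keyframes along each spatio-temporal axis can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how keyframes are queried or decoded, so the bilinear interpolation, the feature concatenation, the tiny MLP, and all sizes (`GRID`, `FEATURE_DIM`, `HIDDEN`) are hypothetical choices made only for illustration. The weights are random and untrained.

```python
import random

random.seed(0)
GRID = 8         # hypothetical keyframe resolution per axis
FEATURE_DIM = 4  # hypothetical number of channels per keyframe cell
HIDDEN = 16      # hypothetical MLP width


def make_grid():
    """One learnable 2D keyframe: GRID x GRID cells of FEATURE_DIM values."""
    return [[[random.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
             for _ in range(GRID)] for _ in range(GRID)]


def bilinear_lookup(grid, u, v):
    """Bilinearly interpolate a keyframe at (u, v), both in [0, 1]."""
    x, y = u * (GRID - 1), v * (GRID - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, GRID - 1), min(y0 + 1, GRID - 1)
    fx, fy = x - x0, y - y0
    return [
        (1 - fx) * ((1 - fy) * grid[x0][y0][c] + fy * grid[x0][y1][c])
        + fx * ((1 - fy) * grid[x1][y0][c] + fy * grid[x1][y1][c])
        for c in range(FEATURE_DIM)
    ]


def make_matrix(rows, cols):
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(vec, mat):
    """Row vector times matrix, using plain lists."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


# One keyframe per pair of axes: (x, y), (x, t), and (y, t).
keyframes = {axes: make_grid() for axes in ("xy", "xt", "yt")}
W1 = make_matrix(3 * FEATURE_DIM, HIDDEN)
W2 = make_matrix(HIDDEN, 3)


def decode_pixel(x, y, t):
    """Map a normalized (x, y, t) coordinate to an RGB triple."""
    feats = (bilinear_lookup(keyframes["xy"], x, y)
             + bilinear_lookup(keyframes["xt"], x, t)
             + bilinear_lookup(keyframes["yt"], y, t))
    hidden = [max(h, 0.0) for h in matvec(feats, W1)]  # ReLU
    return matvec(hidden, W2)


rgb = decode_pixel(0.25, 0.5, 0.75)
print(rgb)
```

In this sketch the video content lives in the three small grids rather than in the network weights, which is one way to read "amortize a video as latent codes". Because the grids are ordinary 2D arrays, they could in principle be handed to a standard image codec, as the abstract proposes for the latent codes.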

Code Implementations: 1 repo
