CV | IV | Oct 13, 2024

Tokenizing Motion: A Generative Approach for Scene Dynamics Compression

Shanzhi Yin, Zihan Zhang, Bolin Chen, Shiqi Wang, Yan Ye

arXiv:2410.09768v26.53 citations | h-index: 8 | Has Code

Originality: Highly original

AI Analysis

This paper addresses ultra-low bitrate communication for diverse video scenes, offering a novel method for a known bottleneck.

The paper tackles video compression by using motion pattern priors from common scene dynamics, achieving superior rate-distortion performance and outperforming the state-of-the-art ECM codec on scene dynamics sequences.

This paper proposes a novel generative video compression framework that leverages motion pattern priors, derived from subtle dynamics in common scenes (e.g., swaying flowers or a boat drifting on water), rather than relying on video content priors (e.g., talking faces or human bodies). These compact motion priors enable a new approach to ultra-low bitrate communication while achieving high-quality reconstruction across diverse scene contents. At the encoder side, motion priors can be streamlined into compact representations via a dense-to-sparse transformation. At the decoder side, these priors facilitate the reconstruction of scene dynamics using an advanced flow-driven diffusion model. Experimental results illustrate that the proposed method can achieve superior rate-distortion performance and outperform the state-of-the-art conventional video codec Enhanced Compression Model (ECM) on scene dynamics sequences. The project page can be found at https://github.com/xyzysz/GNVDC.

