CV | MM | Jan 21, 2025

GSVC: Efficient Video Representation and Compression Through 2D Gaussian Splatting

arXiv:2501.12060v2 | 9 citations | h-index: 8 | NOSSDAV
Originality: Incremental advance
AI Analysis

This work addresses video compression efficiency for applications requiring high-speed rendering, though it is incremental as it adapts an existing 3D method to 2D video.

The paper tackles video representation and compression by proposing GSVC, which uses 2D Gaussian splats to represent videos, achieving rate-distortion trade-offs comparable to state-of-the-art codecs like AV1 and VVC and a rendering speed of 1500 fps for 1080p video.

3D Gaussian splats have emerged as a revolutionary, effective, learned representation for static 3D scenes. In this work, we explore using 2D Gaussian splats as a new primitive for representing videos. We propose GSVC, an approach to learning a set of 2D Gaussian splats that can effectively represent and compress video frames. GSVC incorporates the following techniques: (i) to exploit temporal redundancy among adjacent frames, which can speed up training and improve compression efficiency, we predict the Gaussian splats of a frame from those of its previous frame; (ii) to control the trade-off between file size and quality, we remove Gaussian splats that contribute little to video quality; (iii) to capture dynamics in videos, we randomly add Gaussian splats to fit content with large motion or newly appeared objects; (iv) to handle significant changes in the scene, we detect key frames based on loss differences during the learning process. Experimental results show that GSVC achieves good rate-distortion trade-offs, comparable to state-of-the-art video codecs such as AV1 and VVC, and a rendering speed of 1500 fps for a 1920x1080 video.
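
To make the idea of representing a frame with 2D Gaussian splats and then removing low-contribution splats more concrete, here is a minimal NumPy sketch. It is not the GSVC implementation: the function names (`render_gaussians`, `prune_low_contribution`), the additive blending of splats, and the leave-one-out MSE score used to rank contribution are all assumptions made for illustration, and the abstract does not specify how contribution is actually measured.

```python
import numpy as np


def render_gaussians(means, inv_covs, colors, height, width):
    """Render N anisotropic 2D Gaussians into an (H, W, 3) image.

    Assumption: splats are blended by plain summation (no alpha ordering).
    means:    (N, 2) splat centers as (x, y) pixel coordinates
    inv_covs: (N, 2, 2) inverse covariance matrices
    colors:   (N, 3) RGB weights
    """
    ys, xs = np.mgrid[0:height, 0:width]
    pix = np.stack([xs, ys], axis=-1).astype(np.float64)  # (H, W, 2)
    image = np.zeros((height, width, 3))
    for mu, inv_cov, color in zip(means, inv_covs, colors):
        d = pix - mu
        # Squared Mahalanobis distance of every pixel to the splat center.
        m = np.einsum("hwi,ij,hwj->hw", d, inv_cov, d)
        image += np.exp(-0.5 * m)[..., None] * color
    return image


def prune_low_contribution(means, inv_covs, colors, target, keep_ratio):
    """Return sorted indices of the splats to keep.

    Hypothetical scoring rule: a splat's contribution is the increase in
    MSE against `target` when that splat alone is removed.
    """
    h, w, _ = target.shape
    full = render_gaussians(means, inv_covs, colors, h, w)
    base_mse = np.mean((full - target) ** 2)
    scores = []
    for i in range(len(means)):
        single = render_gaussians(
            means[i:i + 1], inv_covs[i:i + 1], colors[i:i + 1], h, w
        )
        # Additive blending lets us subtract one splat's image directly.
        scores.append(np.mean((full - single - target) ** 2) - base_mse)
    n_keep = max(1, int(round(keep_ratio * len(means))))
    top = np.argsort(scores)[::-1][:n_keep]
    return np.sort(top)
```

Lowering `keep_ratio` drops more splats, which shrinks the representation at the cost of reconstruction quality; this is the size-versus-quality knob that technique (ii) describes, under the scoring assumption stated above.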
