CV, Mar 6, 2025

GaussianVideo: Efficient Video Representation and Compression by Gaussian Splatting

arXiv:2503.04333v1
5 citations
h-index: 2
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Originality: Incremental advance
AI Analysis

The work addresses efficiency bottlenecks that limit practical video compression, and it represents an incremental improvement over existing neural video representation methods.

The paper tackles slow encoding and decoding and high memory consumption in implicit neural video representation methods by proposing a deformable 2D Gaussian Splatting approach. It reports up to 78.4% lower GPU memory usage, 5.5x faster training, and 12.5x faster decoding compared to state-of-the-art NeRV methods.

Implicit Neural Representation for Videos (NeRV) has introduced a novel paradigm for video representation and compression, outperforming traditional codecs. As model size grows, however, slow encoding and decoding speed and high memory consumption hinder its application in practice. To address these limitations, we propose a new video representation and compression method based on 2D Gaussian Splatting to efficiently handle video data. Our proposed deformable 2D Gaussian Splatting dynamically adapts the transformation of 2D Gaussians at each frame, significantly reducing memory cost. Equipped with a multi-plane-based spatiotemporal encoder and a lightweight decoder, it predicts changes in color, coordinates, and shape of initialized Gaussians, given the time step. By leveraging temporal gradients, our model effectively captures temporal redundancy at negligible cost, significantly enhancing video representation efficiency. Our method reduces GPU memory usage by up to 78.4%, and significantly expedites video processing, achieving 5.5x faster training and 12.5x faster decoding compared to the state-of-the-art NeRV methods.
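The idea of deforming one shared set of 2D Gaussians per time step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the axis-aligned covariance, the additive (non-alpha-blended) rendering, and the hand-written linear-motion `deform` function are all simplifying assumptions. In particular, `deform` and `velocity` are hypothetical stand-ins for the learned multi-plane encoder and lightweight decoder, which in the paper also predict changes in color and shape.

```python
import numpy as np

def render_gaussians(means, scales, colors, height, width):
    """Sum axis-aligned 2D Gaussians into an RGB image (assumed additive rendering)."""
    ys, xs = np.mgrid[0:height, 0:width]
    pixels = np.stack([xs, ys], axis=-1).astype(np.float64)  # (H, W, 2) as (x, y)
    # Offset of every pixel from every Gaussian centre: (N, H, W, 2)
    diff = pixels[None] - means[:, None, None, :]
    # Axis-aligned covariance, so the exponent is a sum of squared scaled offsets.
    exponent = -0.5 * np.sum((diff / scales[:, None, None, :]) ** 2, axis=-1)
    weights = np.exp(exponent)  # (N, H, W)
    image = np.einsum("nhw,nc->hwc", weights, colors)
    return np.clip(image, 0.0, 1.0)

def deform(means, scales, colors, t, velocity):
    """Hypothetical deformation: linear motion of the centres only.

    The paper's learned model would predict offsets for color,
    coordinates, and shape from the time step t instead.
    """
    return means + t * velocity, scales, colors

# One shared set of initial Gaussians for the whole clip.
means = np.array([[8.0, 8.0], [24.0, 16.0]])      # (x, y) centres
scales = np.array([[3.0, 3.0], [4.0, 2.0]])       # per-axis standard deviations
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
velocity = np.array([[10.0, 0.0], [0.0, -5.0]])   # assumed motion per unit time

# Each frame is rendered from the same Gaussians, deformed for its time step.
frames = [
    render_gaussians(*deform(means, scales, colors, t, velocity), 32, 32)
    for t in (0.0, 0.5, 1.0)
]
```

Storing one set of Gaussians plus a small deformation model, rather than a separate set per frame, is what lets this kind of representation exploit temporal redundancy; the sketch only shows that structure, not the reported memory or speed figures.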

