GIFStream: 4D Gaussian-based Immersive Video with Feature Stream
GIFStream addresses the trade-off between storage cost and visual quality in immersive video, offering an incremental improvement for applications such as VR and AR.
The paper tackles the challenge of maintaining quality with manageable storage in 4D Gaussian Splatting for immersive video. It introduces GIFStream, a representation that combines a canonical space, a deformation field, and time-dependent feature streams. The method delivers high-quality immersive video at 30 Mbps, with real-time rendering and fast decoding on an RTX 4090.
Immersive video offers a six-degrees-of-freedom (6-DoF) free-viewing experience and may play a key role in future video technology. Recently, 4D Gaussian Splatting has gained attention as an effective approach for immersive video due to its high rendering efficiency and quality, though maintaining quality with manageable storage remains challenging. To address this, we introduce GIFStream, a novel 4D Gaussian representation that uses a canonical space and a deformation field enhanced with time-dependent feature streams. These feature streams enable complex motion modeling and allow efficient compression by leveraging temporal correspondence and motion-aware pruning. Additionally, we incorporate both temporal and spatial compression networks for end-to-end compression. Experimental results show that GIFStream delivers high-quality immersive video at 30 Mbps, with real-time rendering and fast decoding on an RTX 4090. Project page: https://xdimlab.github.io/GIFStream
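To make the representation described above more concrete, the following is a minimal NumPy sketch of the general idea: canonical Gaussian positions are displaced by a deformation field that reads a static per-Gaussian feature together with a time-dependent feature, and feature streams are kept only for Gaussians that appear to move. It is not the authors' implementation. The single linear layer, the standard-deviation pruning criterion, the function names `deform`, `motion_mask`, and `bits_per_frame`, and the frame rate used in the budget calculation are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def deform(canonical_xyz, static_feat, stream_feat_t, W, b):
    """Hypothetical deformation field (assumption: one linear layer + tanh).

    canonical_xyz: (N, 3) Gaussian centers in canonical space
    static_feat:   (N, Fs) time-independent per-Gaussian features
    stream_feat_t: (N, Ft) time-dependent features for one frame
    W, b:          weights (Fs + Ft, 3) and bias (3,)
    Returns the (N, 3) deformed centers for that frame.
    """
    x = np.concatenate([static_feat, stream_feat_t], axis=1)
    offset = np.tanh(x @ W + b)
    return canonical_xyz + offset


def motion_mask(stream_feats, threshold):
    """Hypothetical motion-aware pruning criterion (an assumption).

    stream_feats: (T, N, Ft) time-dependent features over T frames.
    A Gaussian keeps its feature stream only if its features vary
    over time by more than `threshold` (mean per-channel std).
    """
    variation = stream_feats.std(axis=0).mean(axis=1)  # shape (N,)
    return variation > threshold


def bits_per_frame(bitrate_mbps, fps):
    """Average bit budget per frame for a given bitrate and frame rate."""
    return bitrate_mbps * 1_000_000 / fps


# Toy data: two Gaussians, three frames, two stream channels.
stream_feats = np.zeros((3, 2, 2))
stream_feats[:, 1, :] = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # only Gaussian 1 varies

keep = motion_mask(stream_feats, threshold=0.1)
# Static Gaussians carry no stream; their time-dependent input is zeroed.
pruned_streams = stream_feats * keep[None, :, None]

canonical_xyz = np.zeros((2, 3))
static_feat = np.ones((2, 4))
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(6, 3))
b = np.zeros(3)

# Deformed centers for every frame, shape (T, N, 3).
frames = np.stack(
    [deform(canonical_xyz, static_feat, pruned_streams[t], W, b) for t in range(3)]
)

# Frame rate is an assumed value; the abstract states only the 30 Mbps bitrate.
budget = bits_per_frame(30, fps=30)
```

In this toy setup the pruned Gaussian receives the same offset in every frame, because its time-dependent input is zero, while the retained Gaussian moves. That is the intuition behind spending bits only on feature streams that encode motion.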