CV · May 25

R5DGS: Semantic-Aware 4D Gaussian Splatting with Rigid Body Constraints for Efficient Dynamic Scene Reconstruction

arXiv:2605.25909 · 35.8
AI Analysis

This work addresses efficient semantic-aware dynamic scene reconstruction for robotics, AR/VR, and digital twins, offering a novel method that reduces computational overhead while enabling object-level interaction.

R5DGS introduces a physics-informed 4D Gaussian representation with identity encoding and a CLIP-based object lookup for open-vocabulary semantic segmentation in dynamic scenes, achieving an 11 FPS speedup in extrapolation while maintaining trajectory plausibility.

Reconstructing and predicting dynamic 3D scenes from multi-view videos is a foundational task for robotics, AR/VR, and digital twins. Recent physics-informed Gaussian Splatting methods achieve impressive future frame extrapolation but lack semantic awareness and suffer from large computational overhead. We introduce $\textbf{R5DGS}$, a framework that augments a physics-driven 4D Gaussian representation with compact Identity Encoding vectors, enabling precise Gaussian-to-object association. By constructing an offline CLIP-based object lookup table, we support open-vocabulary text prompting to retrieve and render object-specific Gaussians across arbitrary timestamps and viewpoints. Furthermore, we propose a rigid-body inference constraint that predicts and integrates physical dynamics exclusively for object centroids, propagating motion to associated Gaussians via relative transformations. This optimization yields an 11 FPS speedup during extrapolation without compromising trajectory plausibility.
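
The rigid-body constraint described in the abstract can be illustrated with a minimal sketch. It assumes each Gaussian stores a fixed offset from its object's centroid, so that only the centroid pose has to be predicted and every associated Gaussian follows through p_i(t) = c(t) + R(t) r_i. All function names, the z-axis rotation, and the hand-picked centroid pose are hypothetical choices made for illustration. They stand in for the learned physical dynamics and are not taken from the paper, whose actual formulation may differ.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix for an angle theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def local_offsets(centers, centroid):
    """Offset r_i of every Gaussian center from the object centroid at the reference time."""
    return [[p[k] - centroid[k] for k in range(3)] for p in centers]

def propagate(offsets, centroid_t, rotation_t):
    """Rigid-body update p_i(t) = c(t) + R(t) r_i for all Gaussians of one object."""
    rotated = (mat_vec(rotation_t, r) for r in offsets)
    return [[centroid_t[k] + rv[k] for k in range(3)] for rv in rotated]

# Hypothetical object made of four Gaussian centers at the reference time.
centers0 = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, -2.0, 0.0]]
centroid0 = [sum(p[k] for p in centers0) / len(centers0) for k in range(3)]
offsets = local_offsets(centers0, centroid0)

# Assumed centroid pose at a future timestamp; in practice this would come
# from a dynamics model evaluated once per object rather than once per Gaussian.
centroid_t = [0.5, 0.0, 3.0]
centers_t = propagate(offsets, centroid_t, rot_z(math.pi / 2))
```

Under these assumptions the dynamics step runs once per object instead of once per Gaussian, which is the kind of saving the abstract attributes to the constraint. Because the update is a rigid transformation, distances between Gaussians of the same object are preserved.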

