CV Mar 19

Inst4DGS: Instance-Decomposed 4D Gaussian Splatting with Multi-Video Label Permutation Learning

arXiv:2603.18402 43.3 h-index: 5
AI Analysis

This work addresses the challenge of preserving consistent instance identities in dynamic 3D scene reconstruction, with applications such as tracking and segmentation. It represents an incremental advance in the field.

The paper tackles instance-decomposed 4D Gaussian Splatting by resolving inconsistent instance labels across independently segmented multi-view videos. It reports state-of-the-art rendering and segmentation quality: on the Panoptic Studio dataset, PSNR improves from 26.10 to 28.36 and instance mIoU from 0.6310 to 0.9129 over the strongest baseline.

We present Inst4DGS, an instance-decomposed 4D Gaussian Splatting (4DGS) approach with long-horizon per-Gaussian trajectories. While dynamic 4DGS has advanced rapidly, instance-decomposed 4DGS remains underexplored, largely due to the difficulty of associating inconsistent instance labels across independently segmented multi-view videos. We address this challenge by introducing per-video label-permutation latents that learn cross-video instance matches through a differentiable Sinkhorn layer, enabling direct multi-view supervision with consistent identity preservation. This explicit label alignment yields sharp decision boundaries and temporally stable identities without identity drift. To further improve efficiency, we propose instance-decomposed motion scaffolds that provide low-dimensional motion bases per object for long-horizon trajectory optimization. Experiments on Panoptic Studio and Neural3DV show that Inst4DGS jointly supports tracking and instance decomposition while achieving state-of-the-art rendering and segmentation quality. On the Panoptic Studio dataset, Inst4DGS improves PSNR from 26.10 to 28.36, and instance mIoU from 0.6310 to 0.9129, over the strongest baseline.
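
The label-alignment idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `sinkhorn`, the temperature, the iteration count, and the toy 3-instance setup are all assumptions made for illustration. It shows how a square matrix of real-valued latents (one such matrix per video) can be pushed toward a soft permutation by alternating row and column normalization, and how that matrix then remaps a video's local instance labels to shared identities.

```python
import numpy as np


def _logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis.
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))


def sinkhorn(latents, n_iters=50, temperature=0.1):
    """Turn a square latent matrix into an approximately doubly stochastic one.

    Rows index a video's local instance labels, columns index shared
    (global) identities. Lower temperature gives a sharper, more
    permutation-like result. All hyperparameters here are illustrative.
    """
    log_p = latents / temperature
    for _ in range(n_iters):
        log_p = log_p - _logsumexp(log_p, axis=1)  # rows sum to 1
        log_p = log_p - _logsumexp(log_p, axis=0)  # columns sum to 1
    return np.exp(log_p)


# Toy latents for one video: local label i should match global identity perm[i].
perm = np.array([2, 0, 1])
latents = np.full((3, 3), -1.0)
latents[np.arange(3), perm] = 1.0
latents += 0.1 * np.arange(9).reshape(3, 3) / 9  # small asymmetric perturbation

P = sinkhorn(latents)

# Remap per-pixel local labels (as one-hot rows) into the shared label space.
local_labels = np.array([0, 1, 2, 1])
aligned = (np.eye(3)[local_labels] @ P).argmax(axis=1)
print(aligned)  # [2 0 1 0]
```

Because every step is a differentiable normalization, gradients from a downstream segmentation loss could in principle flow back into `latents`, which is what makes this kind of layer usable for learning the cross-video matching rather than fixing it in advance.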

