CV · May 8

From Pixels to Primitives: Scene Change Detection in 3D Gaussian Splatting

arXiv:2605.07203 · 85.3
Predicted impact: top 22% in CV · last 90 days · Originality: Highly original
AI Analysis

For computer vision applications requiring 3D scene change detection, this work introduces a new paradigm that avoids render-then-compare limitations, though it is domain-specific to Gaussian splatting representations.

The authors propose GS-DIFF, a method for scene change detection that operates directly on 3D Gaussian primitives instead of rendered images, achieving multi-view consistent change maps and separate geometric/appearance change scoring. It surpasses prior SOTA by ~17% mIoU on real-world benchmarks.

Scene change detection methods built on Gaussian splatting universally follow a render-then-compare paradigm: the pre-change scene is rendered into 2D and compared against post-change images via pixel or feature residuals. Change detection with Gaussian splatting has thus been treated as a question about pixels; we treat it as a question about primitives. We provide direct evidence that native primitive attributes alone -- position, anisotropic covariance, and color -- carry sufficient signal for scene change detection. What makes primitive-space comparison hard is the under-constrained nature of the Gaussian splatting representation: independent optimizations yield primitive solutions whose count, positions, shapes, and colors differ even where nothing has changed. We address this challenge with anisotropic models of geometric and photometric drift, complemented by a per-primitive observability term that reflects the extent to which each Gaussian is constrained by the camera geometry. Operating directly on primitives gives our method, GS-DIFF, two properties that distinguish it from render-then-compare methods. First, change maps are multi-view consistent by construction, whereas prior work had to learn this through an additional optimization objective. Second, geometric and appearance changes are scored separately, identifying not just where but what kind of change occurred, distinguishing structural changes (e.g., an added object) from surface-level ones (e.g., a color change) without supervision or external model dependencies. On real-world benchmarks, GS-DIFF surpasses the prior state-of-the-art approach by approximately 17% in mean Intersection over Union.
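To make the idea of comparing primitives rather than pixels concrete, here is a minimal sketch. It is not the GS-DIFF algorithm, whose drift models and observability term the abstract does not specify. It assumes a simple stand-in: each post-change Gaussian is matched to its nearest pre-change Gaussian under a Mahalanobis distance that uses the sum of the two covariances, and a geometric score and a color score are reported separately. The names `Gaussian`, `score_changes`, and `classify`, and both thresholds, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: tuple   # 3D position
    cov: list     # 3x3 symmetric positive-definite covariance
    color: tuple  # RGB in [0, 1]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def mahalanobis_sq(delta, cov):
    """Squared Mahalanobis length of delta under covariance cov."""
    x = solve(cov, delta)
    return sum(d * xi for d, xi in zip(delta, x))


def score_changes(pre, post):
    """Return one (geometric, appearance) score pair per post-change Gaussian."""
    scores = []
    for g in post:
        best_d2, best_p = None, None
        for p in pre:
            delta = [a - b for a, b in zip(g.mean, p.mean)]
            # Assumption: positional uncertainty is the sum of both covariances.
            pooled = [[g.cov[i][j] + p.cov[i][j] for j in range(3)] for i in range(3)]
            d2 = mahalanobis_sq(delta, pooled)
            if best_d2 is None or d2 < best_d2:
                best_d2, best_p = d2, p
        geometric = math.sqrt(best_d2)
        appearance = math.dist(g.color, best_p.color)
        scores.append((geometric, appearance))
    return scores


def classify(score, geo_thresh=3.0, app_thresh=0.3):
    """Hypothetical thresholds: geometry is checked first, then color."""
    geometric, appearance = score
    if geometric > geo_thresh:
        return "structural"
    if appearance > app_thresh:
        return "appearance"
    return "unchanged"


def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]


pre = [
    Gaussian((0.0, 0.0, 0.0), diag(1.0, 0.01, 0.01), (0.8, 0.2, 0.2)),
    Gaussian((5.0, 0.0, 0.0), diag(0.01, 0.01, 0.01), (0.2, 0.2, 0.8)),
]
post = [
    # Drifted along its own long axis: small anisotropic distance.
    Gaussian((0.5, 0.0, 0.0), diag(1.0, 0.01, 0.01), (0.8, 0.2, 0.2)),
    # Same place and shape, different color.
    Gaussian((5.0, 0.0, 0.0), diag(0.01, 0.01, 0.01), (0.2, 0.8, 0.2)),
    # New primitive far from anything in the pre-change scene.
    Gaussian((0.0, 3.0, 0.0), diag(0.01, 0.01, 0.01), (0.5, 0.5, 0.5)),
]

scores = score_changes(pre, post)
print([classify(s) for s in scores])
```

In this toy scene the three post-change Gaussians come out as unchanged, appearance, and structural, in that order. The first case shows why anisotropy matters: a 0.5-unit shift along an elongated Gaussian's long axis scores low, while the same shift along one of its thin axes would score roughly ten times higher. The sketch omits the per-primitive observability term and uses a brute-force nearest-neighbor search, so it illustrates the separation of geometric and appearance scoring only.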

