OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects
This work addresses online 3D reconstruction under arbitrary object motion. The contribution is incremental: it builds on existing pose-free methods while improving their performance and efficiency.
The paper tackles the problem of reconstructing 3D models of free-moving objects from monocular video without camera pose or depth cues, achieving significant improvements over state-of-the-art pose-free baselines while maintaining constant computational cost regardless of video length.
Free-moving object reconstruction from monocular video remains challenging, particularly without reliable pose or depth cues and under arbitrary object motion. We introduce OnlineSplatter, a novel online feed-forward framework generating high-quality, object-centric 3D Gaussians directly from RGB frames without requiring camera pose, depth priors, or bundle optimization. Our approach anchors reconstruction using the first frame and progressively refines the object representation through a dense Gaussian primitive field, maintaining constant computational cost regardless of video sequence length. Our core contribution is a dual-key memory module combining latent appearance-geometry keys with explicit directional keys, robustly fusing current frame features with temporally aggregated object states. This design enables effective handling of free-moving objects via spatial-guided memory readout and an efficient sparsification mechanism, ensuring comprehensive yet compact object coverage. Evaluations on real-world datasets demonstrate that OnlineSplatter significantly outperforms state-of-the-art pose-free reconstruction baselines, consistently improving with more observations while maintaining constant memory and runtime.
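To make the dual-key memory idea more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes each memory slot stores a latent key, a unit viewing-direction key, and a value vector; that readout is a softmax over a weighted sum of the two cosine similarities; and that sparsification drops the slot whose direction is most redundant while protecting the first-frame anchor. The class name `DualKeyMemory`, the weight `alpha`, and all dimensions are hypothetical choices made for illustration only.

```python
import numpy as np


def unit(x):
    """Normalize vectors along the last axis (epsilon avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)


class DualKeyMemory:
    """Toy fixed-capacity memory with a latent key and a direction key per slot.

    Hypothetical design for illustration; the paper's module may differ.
    """

    def __init__(self, capacity, latent_dim, value_dim, alpha=0.5):
        self.capacity = capacity
        self.alpha = alpha  # assumed mixing weight between the two similarities
        self.latent_keys = np.empty((0, latent_dim))
        self.dir_keys = np.empty((0, 3))
        self.values = np.empty((0, value_dim))

    def write(self, latent_key, direction, value):
        """Append one slot, then prune if the capacity is exceeded."""
        self.latent_keys = np.vstack([self.latent_keys, unit(latent_key)[None]])
        self.dir_keys = np.vstack([self.dir_keys, unit(direction)[None]])
        self.values = np.vstack([self.values, value[None]])
        if len(self.values) > self.capacity:
            self._sparsify()

    def _sparsify(self):
        """Drop the slot whose direction is closest to another slot's direction."""
        sim = self.dir_keys @ self.dir_keys.T
        np.fill_diagonal(sim, -np.inf)   # ignore self-similarity
        redundancy = sim.max(axis=1)
        redundancy[0] = -np.inf          # assumption: never drop the first-frame anchor
        drop = int(np.argmax(redundancy))
        keep = np.arange(len(self.values)) != drop
        self.latent_keys = self.latent_keys[keep]
        self.dir_keys = self.dir_keys[keep]
        self.values = self.values[keep]

    def read(self, latent_query, direction_query):
        """Softmax-weighted readout using both key similarities."""
        score = (self.alpha * self.latent_keys @ unit(latent_query)
                 + (1.0 - self.alpha) * self.dir_keys @ unit(direction_query))
        weights = np.exp(score - score.max())  # numerically stable softmax
        weights /= weights.sum()
        return weights @ self.values


# Usage: stream 10 random "frames" through a memory that holds only 4 slots.
rng = np.random.default_rng(0)
memory = DualKeyMemory(capacity=4, latent_dim=8, value_dim=16)
first_value = None
for t in range(10):
    value = rng.normal(size=16)
    if t == 0:
        first_value = value
    memory.write(rng.normal(size=8), rng.normal(size=3), value)
fused = memory.read(rng.normal(size=8), rng.normal(size=3))
```

Because the slot count is capped, the cost of each write and read in this sketch depends only on the capacity and not on how many frames have been seen, which mirrors the constant memory and runtime property claimed in the abstract. Pruning by directional redundancy is one simple way to keep the retained slots spread across viewing directions; a learned criterion could be substituted.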