CV | Nov 26, 2024

DROID-Splat: Combining end-to-end SLAM with 3D Gaussian Splatting

arXiv:2411.17660v2 | 14.719 citations | h-index: 3 | Has Code | 2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and accurate SLAM systems in robotics and computer vision applications, representing an incremental improvement by integrating existing techniques.

The paper tackles the problem of achieving optimal trade-offs between robustness, speed, and accuracy in SLAM systems for monocular video by combining an end-to-end tracker with a 3D Gaussian Splatting renderer, resulting in state-of-the-art tracking and rendering performance on common benchmarks.

Recent progress in scene synthesis makes standalone SLAM systems purely based on optimizing hyperprimitives with a rendering objective possible. However, the tracking performance still lags behind traditional and end-to-end SLAM systems. An optimal trade-off between robustness, speed and accuracy has not yet been reached, especially for monocular video. In this paper, we introduce a SLAM system based on an end-to-end Tracker and extend it with a Renderer based on recent 3D Gaussian Splatting techniques. Our framework DroidSplat achieves both SotA tracking and rendering results on common SLAM benchmarks. We implemented multiple building blocks of modern SLAM systems to run in parallel, allowing for fast inference on common consumer GPUs. Recent progress in monocular depth prediction and camera calibration allows our system to achieve strong results even on in-the-wild data without known camera intrinsics. Code will be available at https://github.com/ChenHoy/DROID-Splat.

Code Implementations: 1 repo