CV · Jul 25, 2025

DINO-SLAM: DINO-informed RGB-D SLAM for Neural Implicit and Explicit Representations

arXiv:2507.19474v1 · 2 citations · h-index: 43
Originality: Incremental advance
AI Analysis

This work addresses the need for more comprehensive scene representations in SLAM for robotics and AR/VR applications, though it appears incremental as it builds on existing DINO and SLAM methods.

The paper tackles the problem of enhancing neural implicit and explicit representations in SLAM systems by integrating DINO features, and reports superior performance on the Replica, ScanNet, and TUM benchmarks compared to state-of-the-art methods.

This paper presents DINO-SLAM, a DINO-informed design strategy to enhance neural implicit (Neural Radiance Field, NeRF) and explicit (3D Gaussian Splatting, 3DGS) representations in SLAM systems through more comprehensive scene representations. To this end, we rely on a Scene Structure Encoder (SSE) that enriches DINO features into Enhanced DINO (EDINO) features to capture hierarchical scene elements and their structural relationships. Building upon it, we propose two foundational paradigms for NeRF and 3DGS SLAM systems that integrate EDINO features. Our DINO-informed pipelines achieve superior performance on the Replica, ScanNet, and TUM benchmarks compared to state-of-the-art methods.
