CV · Mar 18, 2025

3D Densification for Multi-Map Monocular VSLAM in Endoscopy

arXiv:2503.14346v2 · 1 citation · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the need for dense and accurate 3D maps in clinical endoscopic applications, representing an incremental improvement over existing sparse methods.

The paper tackles the poor environment representation of sparse multi-map monocular VSLAM in endoscopy by proposing a method that removes outliers and densifies the maps, achieving 4.15 mm RMS accuracy on the C3VD phantom colon dataset.

Multi-map sparse monocular visual Simultaneous Localization and Mapping (VSLAM) applied to monocular endoscopic sequences has proven effective at robustly recovering tracking after the frequent losses in endoscopy caused by motion blur, temporal occlusion, tool interaction, or water jets. The sparse multi-maps are adequate for robust camera localization, but they represent the environment poorly: they are noisy, contain a high percentage of inaccurately reconstructed 3D points, including significant outliers, and, more importantly, their density is unacceptably low for clinical applications. We propose a method to remove outliers and densify the maps of CudaSIFT-SLAM, the state of the art in sparse multi-map endoscopic SLAM. The up-to-scale dense depth predictions of the neural network LightDepth are aligned with the sparse CudaSIFT-SLAM submaps by means of LMedS, which is robust to spurious data. Our system mitigates the inherent scale ambiguity in monocular depth estimation while filtering outliers, leading to reliable densified 3D maps. We provide experimental evidence of accurate densified maps, with 4.15 mm RMS accuracy at affordable computing time, on the C3VD phantom colon dataset. We also report qualitative results on real colonoscopy sequences from the Endomapper dataset.
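The alignment step described above can be illustrated with a minimal sketch of least median of squares (LMedS) fitting. The abstract does not state the alignment model, so this sketch assumes the simplest case: a single scalar factor that maps the up-to-scale predicted depths onto the depths of the sparse map points. The function name `lmeds_scale`, its parameters, and the 2.5-sigma inlier rule (the standard LMedS robust standard deviation estimate) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def lmeds_scale(pred_depth, sparse_depth, max_candidates=200, seed=0):
    """Estimate a scalar scale with LMedS (hypothetical helper).

    pred_depth:   up-to-scale depths predicted at the sparse point pixels.
    sparse_depth: depths of the same points taken from the sparse map.
    Returns the scale and a boolean inlier mask over the sparse points.
    """
    pred = np.asarray(pred_depth, dtype=float)
    sparse = np.asarray(sparse_depth, dtype=float)
    n = pred.size

    # A one-parameter model needs a single correspondence per hypothesis.
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=min(n, max_candidates), replace=False)
    candidates = sparse[idx] / pred[idx]

    # Squared residuals of every point under every candidate scale.
    res2 = (candidates[:, None] * pred[None, :] - sparse[None, :]) ** 2

    # LMedS keeps the hypothesis with the smallest median squared residual.
    med = np.median(res2, axis=1)
    best = int(np.argmin(med))
    scale = candidates[best]

    # Robust standard deviation (assumed: the classic LMedS estimate),
    # floored to avoid a zero threshold on noise-free data.
    sigma = 1.4826 * (1.0 + 5.0 / max(n - 1, 1)) * np.sqrt(med[best])
    sigma = max(sigma, 1e-12)
    inliers = np.abs(scale * pred - sparse) <= 2.5 * sigma
    return scale, inliers
```

Under this assumption, multiplying the dense prediction by the returned scale expresses it in the units of the sparse submap, and the inlier mask flags which sparse points agree with the prediction. Because the median tolerates up to half of the correspondences being wrong, the estimate stays stable in the presence of the outlier-heavy sparse points the abstract describes.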

