CV · LG · RO · Mar 30, 2021

Endo-Depth-and-Motion: Reconstruction and Tracking in Endoscopic Videos using Depth Networks and Photometric Constraints

arXiv:2103.16525v2 · 165 citations
AI Analysis

This work addresses scene reconstruction and camera tracking in endoscopic videos for medical applications. It is an incremental improvement that combines existing depth networks with photometric constraints.

The paper tackles the estimation of camera pose and dense 3D scene models from monocular endoscopic videos, a task made difficult by tissue deformation and lack of texture. It presents Endo-Depth-and-Motion, a pipeline evaluated on the Hamlyn dataset, where the authors report high-quality results.

Estimating a scene reconstruction and the camera motion from in-body videos is challenging due to several factors, e.g. the deformation of in-body cavities or the lack of texture. In this paper we present Endo-Depth-and-Motion, a pipeline that estimates the 6-degrees-of-freedom camera pose and dense 3D scene models from monocular endoscopic videos. Our approach leverages recent advances in self-supervised depth networks to generate pseudo-RGBD frames, then tracks the camera pose using photometric residuals and fuses the registered depth maps in a volumetric representation. We present an extensive experimental evaluation in the public dataset Hamlyn, showing high-quality results and comparisons against relevant baselines. We also release all models and code for future comparisons.
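
To make the tracking step in the abstract more concrete, the following is a minimal sketch of a photometric residual between a reference frame and the current frame, given a depth map for the reference frame. It is not the authors' released code. The function name `photometric_residuals`, its arguments, the pinhole camera model, and the nearest-neighbour pixel lookup are all assumptions chosen for illustration.

```python
import numpy as np


def photometric_residuals(I_ref, D_ref, I_cur, K, T_cur_ref):
    """Intensity differences after warping reference pixels into the current frame.

    Hypothetical helper for illustration only (not from the paper's code).
    I_ref, I_cur : (H, W) grayscale images as floats
    D_ref        : (H, W) positive depth for the reference frame,
                   e.g. predicted by a depth network
    K            : (3, 3) pinhole intrinsics (assumed camera model)
    T_cur_ref    : (4, 4) rigid transform taking reference-camera
                   points into the current camera frame
    Returns (residuals of valid pixels, (H, W) boolean validity mask).
    """
    H, W = I_ref.shape
    v, u = np.mgrid[0:H, 0:W]
    # Homogeneous pixel coordinates, shape (3, H*W).
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)], axis=0)

    # Back-project each pixel to a 3D point using its depth.
    pts_ref = (np.linalg.inv(K) @ pix) * D_ref.ravel()

    # Move the points into the current camera frame.
    pts_cur = T_cur_ref[:3, :3] @ pts_ref + T_cur_ref[:3, 3:4]

    # Project back to pixel coordinates, guarding against z <= 0.
    z = pts_cur[2]
    in_front = z > 1e-6
    z_safe = np.where(in_front, z, 1.0)
    proj = K @ pts_cur
    ui = np.rint(proj[0] / z_safe).astype(int)  # nearest-neighbour lookup
    vi = np.rint(proj[1] / z_safe).astype(int)  # (a simplification)

    valid = in_front & (ui >= 0) & (ui < W) & (vi >= 0) & (vi < H)
    residuals = I_cur[vi[valid], ui[valid]] - I_ref.ravel()[valid]
    return residuals, valid.reshape(H, W)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((4, 6))
    depth = np.full((4, 6), 2.0)
    K = np.array([[100.0, 0.0, 3.0], [0.0, 100.0, 2.0], [0.0, 0.0, 1.0]])
    r, mask = photometric_residuals(img, depth, img, K, np.eye(4))
    print(float(np.abs(r).max()), int(mask.sum()))
```

Under these assumptions, a tracker would search for the pose `T_cur_ref` that minimises some robust sum of these residuals. The optimiser, the robust cost, and the interpolation scheme are left out of this sketch.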

Code Implementations: 1 repo
