SAGE: SLAM with Appearance and Geometry Prior for Endoscopy
This work addresses surgical navigation in endoscopy with a robust SLAM system. The contribution is incremental in that it extends existing SLAM methods with new learned priors.
The authors tackle real-time endoscope tracking and dense 3D reconstruction from monocular endoscopic video with SAGE, a SLAM system that combines learned appearance and geometry priors with factor graph optimization. In the reported experiments, the system robustly handles texture scarceness and illumination variation, generalizes to unseen endoscopes and subjects, and performs favorably compared with a state-of-the-art feature-based SLAM system.
In endoscopy, many applications (e.g., surgical navigation) would benefit from a real-time method that can simultaneously track the endoscope and reconstruct the dense 3D geometry of the observed anatomy from a monocular endoscopic video. To this end, we develop a Simultaneous Localization and Mapping (SLAM) system that combines learning-based appearance priors and optimizable geometry priors with factor graph optimization. The appearance and geometry priors are explicitly learned in an end-to-end differentiable training pipeline to master the task of pair-wise image alignment, one of the core components of the SLAM system. In our experiments, the proposed SLAM system is shown to robustly handle the challenges of texture scarceness and illumination variation that are commonly seen in endoscopy. The system generalizes well to unseen endoscopes and subjects and performs favorably compared with a state-of-the-art feature-based SLAM system. The code repository is available at https://github.com/lppllppl920/SAGE-SLAM.git.
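For readers unfamiliar with factor graph optimization, the following toy sketch may help illustrate the general idea. It is not the SAGE implementation: it assumes scalar (1D) poses, linear relative-measurement factors, and a single prior factor anchoring the first pose, whereas a real SLAM back end optimizes camera poses and geometry with nonlinear factors. The function names `solve_linear` and `optimize_pose_graph_1d` and all measurement values are hypothetical and chosen only for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def optimize_pose_graph_1d(num_poses, relative_factors, prior_weight=1.0):
    """Minimize the sum of weighted squared factor residuals.

    relative_factors: list of (i, j, z, w), each contributing
    w * (x_j - x_i - z)^2. A prior factor prior_weight * x_0^2 anchors
    pose 0 at the origin (hypothetical toy setup, not SAGE's factors).
    """
    n = num_poses
    H = [[0.0] * n for _ in range(n)]  # normal-equation matrix J^T W J
    g = [0.0] * n                      # right-hand side J^T W z
    H[0][0] += prior_weight            # prior factor on pose 0
    for i, j, z, w in relative_factors:
        H[i][i] += w
        H[j][j] += w
        H[i][j] -= w
        H[j][i] -= w
        g[i] -= w * z
        g[j] += w * z
    # The problem is linear, so a single solve gives the optimum.
    return solve_linear(H, g)


# Three odometry-like factors of 1.0 each, plus one loop-closure-like
# factor that disagrees slightly (2.7 instead of 3.0).
factors = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (2, 3, 1.0, 1.0),
           (0, 3, 2.7, 1.0)]
poses = optimize_pose_graph_1d(4, factors)
print([round(p, 3) for p in poses])
```

In this toy problem the optimizer spreads the disagreement between the chained measurements and the long-range measurement evenly across the three steps, which is the basic behavior a factor graph back end provides when it fuses many pair-wise alignment constraints.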