CV · Jan 23, 2024

EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction

arXiv:2401.12561v2 · 31 citations · h-index: 14 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for real-time reconstruction of deformable tissues in endoscopic surgery. It offers significant improvements for intraoperative applications, though the advance is incremental because it builds on existing 3D Gaussian Splatting (3DGS) methods.

The paper tackles the problem of slow rendering speed in endoscopic scene reconstruction by introducing EndoGaussian, a framework based on 3D Gaussian Splatting, which achieves real-time rendering at 195 FPS, a 100× speed gain, with high quality (37.848 PSNR) and low training overhead (within 2 minutes per scene).

Reconstructing deformable tissues from endoscopic videos is essential in many downstream surgical applications. However, existing methods suffer from slow rendering speed, greatly limiting their practical use. In this paper, we introduce EndoGaussian, a real-time endoscopic scene reconstruction framework built on 3D Gaussian Splatting (3DGS). By integrating the efficient Gaussian representation and a highly optimized rendering engine, our framework boosts the rendering speed to a real-time level. To adapt 3DGS for endoscopic scenes, we propose two strategies, Holistic Gaussian Initialization (HGI) and Spatio-temporal Gaussian Tracking (SGT), to handle the non-trivial Gaussian initialization and tissue deformation problems, respectively. In HGI, we leverage recent depth estimation models to predict depth maps of input binocular/monocular image sequences, based on which pixels are re-projected and combined for holistic initialization. In SGT, we propose to model surface dynamics using a deformation field, which is composed of an efficient encoding voxel and a lightweight deformation decoder, allowing for Gaussian tracking with minor training and rendering burden. Experiments on public datasets demonstrate our efficacy against prior SOTAs in many aspects, including better rendering speed (195 FPS real-time, 100× gain), better rendering quality (37.848 PSNR), and less training overhead (within 2 min/scene), showing significant promise for intraoperative surgery applications. Code is available at: https://yifliu3.github.io/EndoGaussian/.
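
The re-projection step described for HGI can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard pinhole camera model, a fixed camera so that all frames share one coordinate frame, and depth maps given as nested lists. The function names `backproject_depth` and `combine_frames` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject_depth(depth: List[List[float]],
                      fx: float, fy: float,
                      cx: float, cy: float) -> List[Point]:
    """Lift every pixel with a valid depth to a 3D point in the camera frame.

    Assumes a pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z <= 0:                     # skip pixels without a depth estimate
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def combine_frames(depth_maps: List[List[List[float]]],
                   fx: float, fy: float,
                   cx: float, cy: float) -> List[Point]:
    """Merge the back-projected points of all frames into one point cloud.

    Assumption: the camera does not move, so no per-frame pose is applied.
    """
    cloud: List[Point] = []
    for depth in depth_maps:
        cloud.extend(backproject_depth(depth, fx, fy, cx, cy))
    return cloud


# Toy 2x2 depth map; a depth of 0.0 marks a missing estimate.
toy_depth = [[2.0, 0.0],
             [2.0, 4.0]]
toy_points = backproject_depth(toy_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
toy_cloud = combine_frames([toy_depth, toy_depth], 2.0, 2.0, 0.0, 0.0)
```

In a 3DGS-style pipeline, a combined cloud of this kind would typically seed the initial Gaussian centers; how EndoGaussian filters or merges points across frames is not specified in the abstract.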

Code Implementations: 1 repo