PROFusion: Robust and Accurate Dense Reconstruction via Camera Pose Regression and Optimization
This work targets robotics applications that require robust and accurate dense reconstruction under large viewpoint changes or fast motions, and it offers an incremental improvement over existing methods.
The paper tackles real-time dense scene reconstruction under unstable camera motion by combining learning-based initialization with optimization-based refinement, improving performance on challenging benchmarks while still running in real time.
Real-time dense scene reconstruction during unstable camera motions is crucial for robotics, yet current RGB-D SLAM systems fail when cameras experience large viewpoint changes, fast motions, or sudden shaking. Classical optimization-based methods deliver high accuracy but break down when large motions leave them poorly initialized, while learning-based approaches provide robustness but lack sufficient accuracy for dense reconstruction. We address this challenge by combining learning-based initialization with optimization-based refinement. Our method employs a camera pose regression network to predict metric-aware relative poses from consecutive RGB-D frames; these predictions serve as reliable starting points for a randomized optimization algorithm that further aligns depth images with the scene geometry. Extensive experiments show that our approach outperforms the best competitor on challenging benchmarks while maintaining comparable accuracy on stable motion sequences. The system operates in real time, showing that combining simple and principled techniques can achieve both robustness to unstable motions and accuracy for dense reconstruction. Project page: https://github.com/siyandong/PROFusion.
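The two-stage idea in the abstract (a coarse initial pose followed by randomized refinement) can be illustrated with a toy sketch. This is not the PROFusion implementation. It makes several simplifying assumptions: the pose is reduced to a translation plus a yaw angle, the cost is a mean squared distance over known point correspondences rather than an alignment of depth images against scene geometry, and the network prediction is replaced by a hand-picked `initial_pose`. All function names are hypothetical.

```python
import math
import random


def apply_pose(pose, point):
    """Apply a (tx, ty, tz, yaw) pose: rotate about the z axis, then translate."""
    tx, ty, tz, yaw = pose
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * y + tx, s * x + c * y + ty, z + tz)


def alignment_cost(pose, source, target):
    """Mean squared distance between transformed source points and their targets."""
    total = 0.0
    for p, q in zip(source, target):
        px, py, pz = apply_pose(pose, p)
        total += (px - q[0]) ** 2 + (py - q[1]) ** 2 + (pz - q[2]) ** 2
    return total / len(source)


def refine_pose(initial_pose, source, target,
                iterations=60, samples=64, scale=0.2, seed=0):
    """Random-search refinement around an initial pose (toy stand-in).

    Each iteration samples Gaussian perturbations of the current best pose,
    moves to the best sample if it lowers the cost, and otherwise halves
    the search radius.
    """
    rng = random.Random(seed)
    best = tuple(initial_pose)
    best_cost = alignment_cost(best, source, target)
    for _ in range(iterations):
        candidates = [
            tuple(v + rng.gauss(0.0, scale) for v in best)
            for _ in range(samples)
        ]
        cand = min(candidates, key=lambda c: alignment_cost(c, source, target))
        cand_cost = alignment_cost(cand, source, target)
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
        else:
            scale *= 0.5  # no improvement: search closer to the current best
    return best, best_cost


# Synthetic data: target points are the source points moved by a known pose.
data_rng = random.Random(1)
source = [tuple(data_rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(50)]
true_pose = (0.5, -0.3, 0.2, 0.4)
target = [apply_pose(true_pose, p) for p in source]

# Assumed stand-in for the regression network's output: close, but not exact.
initial_pose = (0.4, -0.2, 0.1, 0.3)
initial_cost = alignment_cost(initial_pose, source, target)
refined_pose, refined_cost = refine_pose(initial_pose, source, target)
print(f"cost before: {initial_cost:.6f}, after: {refined_cost:.6f}")
```

The sketch shows why initialization matters in this kind of scheme: the random search only explores a neighborhood of its starting point, so a starting pose far from the true one would need a much larger search radius and many more samples to recover.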