Visual SLAM with Graph-Cut Optimized Multi-Plane Reconstruction
This work addresses robust pose estimation and mapping for monocular visual SLAM, an incremental improvement over existing methods that fit homographies and 3D planes separately with RANSAC.
The paper tackles robust monocular visual SLAM with a system that uses graph-cut optimization to jointly solve homography estimation and piece-wise planar reconstruction. This addresses two problems: the inlier-outlier threshold is hard to set when the scene scale is unknown, and the output of the segmentation CNN can be inaccurate. The method reports improved pose estimation and mapping, with a comprehensive evaluation on open-source datasets.
This paper presents a semantic planar SLAM system that improves pose estimation and mapping using cues from an instance planar segmentation network. While mainstream approaches use RGB-D sensors, employing a monocular camera with such a system still faces challenges such as robust data association and precise geometric model fitting. In the majority of existing work, geometric model estimation problems such as homography estimation and piece-wise planar reconstruction (PPR) are solved separately and sequentially by standard (greedy) RANSAC. However, setting the inlier-outlier threshold is difficult in the absence of information about the scene (i.e. the scale). In this work, we revisit these problems and argue that the two geometric models mentioned (homographies and 3D planes) can be estimated by minimizing an energy function that exploits spatial coherence, i.e. with graph-cut optimization, which also handles the practical case where the output of a trained CNN is inaccurate. Moreover, we propose an adaptive parameter setting strategy based on our experiments, and report a comprehensive evaluation on various open-source datasets.
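To make the idea of an energy that exploits spatial coherence concrete, here is a minimal sketch. It assumes a generic labeling energy with a squared point-to-plane data term and a Potts smoothness term; this form is an assumption for illustration, since the abstract does not give the paper's actual energy. The names `labeling_energy` and `smoothness_weight`, the neighbor list, and the toy data are all hypothetical. The sketch only evaluates the energy of a given labeling; a graph-cut solver would search over labelings to minimize it.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
# Plane as (a, b, c, d) with unit normal (a, b, c): a*x + b*y + c*z + d = 0
Plane = Tuple[float, float, float, float]


def point_plane_distance(p: Point, plane: Plane) -> float:
    """Unsigned distance from a 3D point to a plane with unit normal."""
    a, b, c, d = plane
    return abs(a * p[0] + b * p[1] + c * p[2] + d)


def labeling_energy(
    points: Sequence[Point],
    planes: Sequence[Plane],
    labels: Sequence[int],
    neighbors: Sequence[Tuple[int, int]],
    smoothness_weight: float,
) -> float:
    """Data term (squared residuals) plus Potts penalty on neighboring
    points that carry different plane labels (assumed form)."""
    data = sum(
        point_plane_distance(p, planes[l]) ** 2 for p, l in zip(points, labels)
    )
    smooth = sum(1 for i, j in neighbors if labels[i] != labels[j])
    return data + smoothness_weight * smooth


# Toy scene: five points along a line, nominally on the plane z = 0,
# with one noisy point at z = 0.6 that is slightly closer to z = 1.
points: List[Point] = [(0, 0, 0), (1, 0, 0), (2, 0, 0.6), (3, 0, 0), (4, 0, 0)]
planes: List[Plane] = [(0, 0, 1, 0), (0, 0, 1, -1)]  # z = 0 and z = 1
neighbors = [(0, 1), (1, 2), (2, 3), (3, 4)]  # chain adjacency

greedy_labels = [0, 0, 1, 0, 0]    # each point takes its nearest plane
coherent_labels = [0, 0, 0, 0, 0]  # noisy point follows its neighbors

e_greedy = labeling_energy(points, planes, greedy_labels, neighbors, 0.5)
e_coherent = labeling_energy(points, planes, coherent_labels, neighbors, 0.5)
print(e_greedy, e_coherent)
```

In this toy case the spatially coherent labeling has the lower energy even though the noisy point's own residual is larger under it. Per-point nearest-model assignment has no such neighborhood term, which is the kind of behavior the abstract contrasts the energy formulation against.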