RO | CV | Mar 4, 2020

Voxel Map for Visual SLAM

arXiv:2003.02247v1 | 25 citations
AI Analysis

This work addresses a fundamental functionality in visual SLAM for robotics and AR/VR applications and offers a scalable solution for large scenes and multicamera setups, though it is incremental in that it builds on existing SLAM systems.

The paper tackles the inefficiency and geometric limitations of using keyframes for map point retrieval in visual SLAM by proposing a voxel-map representation, which improves localization accuracy by an average of 46% in RMSE on the EuRoC dataset while maintaining efficiency comparable to a keyframe map with 5 keyframes.

In modern visual SLAM systems, it is a standard practice to retrieve potential candidate map points from overlapping keyframes for further feature matching or direct tracking. In this work, we argue that keyframes are not the optimal choice for this task, due to several inherent limitations, such as weak geometric reasoning and poor scalability. We propose a voxel-map representation to efficiently retrieve map points for visual SLAM. In particular, we organize the map points in a regular voxel grid. Visible points from a camera pose are queried by sampling the camera frustum in a raycasting manner, which can be done in constant time using an efficient voxel hashing method. Compared with keyframes, the retrieved points using our method are geometrically guaranteed to fall in the camera field-of-view, and occluded points can be identified and removed to a certain extent. This method also naturally scales up to large scenes and complicated multicamera configurations. Experimental results show that our voxel map representation is as efficient as a keyframe map with 5 keyframes and provides significantly higher localization accuracy (average 46% improvement in RMSE) on the EuRoC dataset. The proposed voxel-map representation is a general approach to a fundamental functionality in visual SLAM and widely applicable.
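
The query described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `VoxelMap`, the helper `frustum_rays`, the fixed-step ray marching, and all parameter values (voxel size, field of view, ray counts, near and far range) are assumptions chosen for illustration. For simplicity the sketch also assumes the camera frame coincides with the world frame, so ray directions need no rotation. Map points are bucketed in a hash table keyed by integer voxel index, and each ray sampled from the frustum is walked from near to far. Stopping at the first occupied voxel is one simple way to drop points that are likely occluded.

```python
import math
from collections import defaultdict


class VoxelMap:
    """Map points bucketed in a regular grid, keyed by integer voxel index."""

    def __init__(self, voxel_size=0.5):
        self.voxel_size = voxel_size
        self.voxels = defaultdict(list)  # (i, j, k) -> list of 3D points

    def key(self, p):
        # Integer voxel index containing point p.
        s = self.voxel_size
        return (math.floor(p[0] / s), math.floor(p[1] / s), math.floor(p[2] / s))

    def insert(self, p):
        self.voxels[self.key(p)].append(tuple(p))

    def query(self, origin, rays, near=0.1, far=10.0, stop_at_first_hit=True):
        """Walk each ray from near to far and collect points in occupied voxels."""
        step = 0.5 * self.voxel_size  # assumed sampling step along the ray
        seen, points = set(), []
        for d in rays:
            t = near
            while t <= far:
                k = self.key([origin[i] + t * d[i] for i in range(3)])
                if k in self.voxels:  # hash lookup; `in` does not create entries
                    if k not in seen:
                        seen.add(k)
                        points.extend(self.voxels[k])
                    if stop_at_first_hit:
                        break  # treat everything further along this ray as occluded
                t += step
        return points


def frustum_rays(fov_x_deg=90.0, fov_y_deg=60.0, nx=9, ny=7):
    """Unit ray directions on a regular grid over the frustum (camera looks along +z)."""
    tx = math.tan(math.radians(fov_x_deg) / 2)
    ty = math.tan(math.radians(fov_y_deg) / 2)
    rays = []
    for iy in range(ny):
        for ix in range(nx):
            x = (2 * ix / (nx - 1) - 1) * tx
            y = (2 * iy / (ny - 1) - 1) * ty
            n = math.sqrt(x * x + y * y + 1.0)
            rays.append((x / n, y / n, 1.0 / n))
    return rays


vm = VoxelMap(voxel_size=0.5)
vm.insert((0.0, 0.0, 2.0))   # in front of the camera
vm.insert((0.0, 0.0, 5.0))   # behind the first point on the same ray
vm.insert((0.0, 0.0, -2.0))  # behind the camera
visible = vm.query((0.0, 0.0, 0.0), frustum_rays())
```

In this sketch the cost of a query depends on the number of rays and samples per ray, not on the number of points or keyframes in the map, since each voxel lookup is an average constant-time hash access. The point behind the camera is never touched by any ray, and the farther point on the central ray is skipped when `stop_at_first_hit` is enabled.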

