CV | Mar 31, 2025

LiM-Loc: Visual Localization with Dense and Accurate 3D Reference Maps Directly Corresponding 2D Keypoints to 3D LiDAR Point Clouds

Masahiko Tsuji, Hitoshi Niigaki, Ryuichi Tanida

arXiv:2503.23664v1 | 3.6 | h-index: 5

Originality: Highly original

AI Analysis

This work addresses the challenge of sparse and inaccurate 3D reference maps in visual localization for robotics and autonomous systems. It offers a more efficient solution by leveraging widely available LiDAR sensors and calibrated cameras.

The paper proposes a method that generates dense and accurate 3D reference maps for visual localization by directly assigning 3D LiDAR points to 2D keypoints, which avoids the errors introduced by feature matching. The authors report improved camera pose estimation accuracy on indoor and outdoor datasets with several state-of-the-art local features.

Visual localization estimates the 6-DOF camera pose of a query image in a 3D reference map. We extract keypoints from the reference images and generate the 3D reference map in advance by 3D reconstruction of those keypoints. We emphasize that the more keypoints the 3D reference map contains, and the smaller the error in their 3D positions, the higher the accuracy of camera pose estimation. However, previous image-only methods require a huge number of images, and it is difficult to reconstruct keypoints in 3D without error because mismatches and failures in feature matching are inevitable. As a result, the 3D reference map is sparse and inaccurate. In contrast, accurate 3D reference maps can be generated by combining images with 3D sensors. 3D LiDAR, which measures large spaces at high density, has recently become inexpensive and widely used. Accurately calibrated cameras are also widely used, so images recorded with accurate camera extrinsic parameters can be easily obtained. In this paper, we propose a method that directly assigns 3D LiDAR points to keypoints to generate dense and accurate 3D reference maps. The proposed method avoids feature matching and achieves accurate 3D reconstruction for almost all keypoints. To estimate camera pose over a wide area, we use the wide-area LiDAR point cloud to remove points that are not visible to the camera, which reduces 2D-3D correspondence errors. Using indoor and outdoor datasets, we apply the proposed method to several state-of-the-art local features and confirm that it improves the accuracy of camera pose estimation.
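The idea of assigning LiDAR points directly to keypoints can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an undistorted pinhole camera with intrinsics `K`, a world-to-camera convention `x_cam = R @ x_world + t`, and a simple per-pixel depth buffer as a stand-in for the paper's visibility filtering. The function name `assign_lidar_to_keypoints` and all parameters are hypothetical.

```python
import numpy as np

def assign_lidar_to_keypoints(points_w, keypoints, K, R, t, image_size):
    """Assign one 3D LiDAR point to each 2D keypoint by projection.

    Hypothetical helper, not from the paper. Returns an (N, 3) array with
    NaN rows for keypoints that received no LiDAR point.
    """
    width, height = image_size
    points_w = np.asarray(points_w, dtype=float)
    keypoints = np.asarray(keypoints, dtype=float)

    # Transform LiDAR points from world to camera coordinates.
    pts_c = points_w @ R.T + t

    # Discard points behind the camera.
    in_front = pts_c[:, 2] > 0
    pts_c, pts_w = pts_c[in_front], points_w[in_front]

    # Pinhole projection to pixel coordinates, rounded to the nearest pixel.
    uvw = pts_c @ K.T
    px = np.round(uvw[:, :2] / uvw[:, 2:3]).astype(int)

    # Keep only points that land inside the image.
    inside = (
        (px[:, 0] >= 0) & (px[:, 0] < width)
        & (px[:, 1] >= 0) & (px[:, 1] < height)
    )
    px, pts_c, pts_w = px[inside], pts_c[inside], pts_w[inside]

    # Depth buffer: per pixel, keep only the point closest to the camera.
    # Assumption: this crude test stands in for the paper's removal of
    # points that are not visible to the camera.
    depth = np.full((height, width), np.inf)
    index = np.full((height, width), -1, dtype=int)
    for i, ((u, v), z) in enumerate(zip(px, pts_c[:, 2])):
        if z < depth[v, u]:
            depth[v, u] = z
            index[v, u] = i

    # Look up the surviving LiDAR point under each keypoint, if any.
    assigned = np.full((len(keypoints), 3), np.nan)
    for k, (ku, kv) in enumerate(np.round(keypoints).astype(int)):
        if 0 <= ku < width and 0 <= kv < height and index[kv, ku] >= 0:
            assigned[k] = pts_w[index[kv, ku]]
    return assigned
```

Because each keypoint takes its 3D position from a measured LiDAR point rather than from triangulation across matched images, no feature matching is involved in building the map. A sparse LiDAR scan leaves many pixels empty in this sketch, so a practical version would likely search a small neighborhood around each keypoint or interpolate depth.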
