A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation
The work addresses registration failures caused by low overlap in robotics and autonomous driving, but it is incremental, as it builds on existing BEV and attention-based methods.
The paper tackles pairwise point cloud registration by jointly learning 3D local features and overlap estimation to handle low-overlap scenes, achieving top registration performance on KITTI and Apollo-SouthBay datasets with reduced translation and rotation errors.
Pairwise point cloud registration is a critical task for many applications and depends heavily on finding correct correspondences between the two point clouds. However, low overlap between the input point clouds easily causes registration to fail, leading to incorrectly estimated overlap and mismatched correspondences, especially in scenes where non-overlapping regions contain similar structures. In this paper, we present a unified bird's-eye view (BEV) model for joint learning of 3D local features and overlap estimation to perform pairwise registration and loop closure. Feature description is performed by a sparse UNet-like network based on the BEV representation, and 3D keypoints are extracted by a detection head for 2D locations and a regression head for heights. For overlap detection, a cross-attention module exchanges contextual information between the input point clouds, followed by a classification head that estimates the overlapping region. We evaluate our unified model extensively on the KITTI and Apollo-SouthBay datasets. The experiments demonstrate that our method significantly outperforms existing methods on overlap estimation, especially in scenes with small overlaps. It also achieves top registration performance on both datasets in terms of translation and rotation errors.
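To make the keypoint step concrete, the following is a minimal sketch of how a 2D detection map and a height regression map could be combined into 3D keypoints. It is not the paper's implementation: the function name `bev_keypoints_to_3d`, the grid parameters `cell_size` and `origin`, and the fixed score `threshold` are all assumptions introduced for illustration, and the abstract does not say how cells are selected or how the BEV grid is defined.

```python
from typing import List, Tuple


def bev_keypoints_to_3d(
    scores: List[List[float]],
    heights: List[List[float]],
    cell_size: float = 0.5,                  # assumed BEV cell edge length (metres)
    origin: Tuple[float, float] = (0.0, 0.0),  # assumed world (x, y) of the grid corner
    threshold: float = 0.5,                  # assumed detection score cut-off
) -> List[Tuple[float, float, float]]:
    """Lift BEV cells whose detection score exceeds `threshold` to 3D points.

    `scores` stands in for the detection head output (one score per cell) and
    `heights` for the regression head output (one height per cell).
    """
    keypoints = []
    for row, (score_row, height_row) in enumerate(zip(scores, heights)):
        for col, (score, z) in enumerate(zip(score_row, height_row)):
            if score > threshold:
                # Use the cell centre as the 2D location of the keypoint.
                x = origin[0] + (col + 0.5) * cell_size
                y = origin[1] + (row + 0.5) * cell_size
                # The regressed height supplies the third coordinate.
                keypoints.append((x, y, z))
    return keypoints


# Toy 2x2 grid: two cells pass the threshold and become 3D keypoints.
toy_scores = [[0.1, 0.9], [0.7, 0.2]]
toy_heights = [[0.0, 1.5], [-0.3, 0.0]]
toy_keypoints = bev_keypoints_to_3d(toy_scores, toy_heights, cell_size=1.0)
```

The point of the sketch is the division of labour: detection happens entirely in the 2D BEV plane, and only the selected cells need a height value to become 3D keypoints.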