MinCD-PnP: Learning 2D-3D Correspondences with Approximate Blind PnP
This work improves registration accuracy for computer vision applications such as robotics and AR, but the contribution is incremental because it builds on existing I2P registration architectures.
The paper addresses image-to-point-cloud registration, specifically the sensitivity of differential PnP to noise and outliers in 2D-3D correspondences. It proposes MinCD-PnP, which simplifies blind PnP to minimizing a Chamfer distance, and reports a higher inlier ratio and registration recall across multiple datasets.
Image-to-point-cloud (I2P) registration is a fundamental problem in computer vision that focuses on establishing 2D-3D correspondences between an image and a point cloud. Differential perspective-n-point (PnP) has been widely used to supervise I2P registration networks by enforcing projective constraints on 2D-3D correspondences. However, differential PnP is highly sensitive to noise and outliers in the predicted correspondences, which hinders the effectiveness of correspondence learning. Inspired by the robustness of blind PnP against noise and outliers in correspondences, we propose a correspondence learning approach based on approximate blind PnP. To mitigate the high computational cost of blind PnP, we simplify it to the more tractable task of minimizing the Chamfer distance between learned 2D and 3D keypoints, which we call MinCD-PnP. To solve MinCD-PnP effectively, we design a lightweight multi-task learning module, named MinCD-Net, which can be easily integrated into existing I2P registration architectures. Extensive experiments on 7-Scenes, RGBD-V2, ScanNet, and self-collected datasets demonstrate that MinCD-Net outperforms state-of-the-art methods and achieves a higher inlier ratio (IR) and registration recall (RR) in both cross-scene and cross-dataset settings.
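To make the central quantity concrete, the following is a minimal NumPy sketch of a Chamfer distance between 2D keypoints and 3D keypoints projected into the image under a candidate pose. It is an illustration only, not the authors' formulation: the symmetric mean-of-nearest-neighbor form, the pinhole intrinsics `K`, and the helper names `project` and `chamfer_distance` are all assumptions made for this example. The point it shows is that the cost needs no known 2D-3D pairing, which is the property that distinguishes the blind setting from standard PnP.

```python
import numpy as np

def project(points_3d, K, R, t):
    """Project Nx3 points into the image with a pinhole model (assumed here)."""
    cam = points_3d @ R.T + t          # transform points into the camera frame
    uvw = cam @ K.T                    # apply the intrinsics
    return uvw[:, :2] / uvw[:, 2:3]    # perspective division -> pixel coordinates

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (Nx2) and b (Mx2).

    Assumption: mean nearest-neighbor Euclidean distance in both directions.
    No correspondence between rows of a and rows of b is required.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # NxM distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Hypothetical toy data: 50 points in front of a camera with made-up intrinsics.
rng = np.random.default_rng(0)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
pts_3d = rng.uniform([-1.0, -1.0, 2.0], [1.0, 1.0, 5.0], size=(50, 3))

R_true, t_true = np.eye(3), np.zeros(3)
# 2D keypoints are the true projections in shuffled order (unknown pairing).
kp_2d = project(pts_3d, K, R_true, t_true)[rng.permutation(50)]

cd_true = chamfer_distance(project(pts_3d, K, R_true, t_true), kp_2d)
cd_wrong = chamfer_distance(
    project(pts_3d, K, R_true, np.array([0.2, 0.0, 0.0])), kp_2d
)
print(f"Chamfer distance at true pose:  {cd_true:.3f}")
print(f"Chamfer distance at wrong pose: {cd_wrong:.3f}")
```

In this toy setup the cost is zero at the true pose despite the shuffled keypoint order and grows when the pose is perturbed, so it can serve as a pose objective without explicit correspondences. How MinCD-Net predicts the keypoints and uses such a cost during training is not specified in the abstract and is not modeled here.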