ALIKED: A Lighter Keypoint and Descriptor Extraction Network via Deformable Transformation
This work addresses a bottleneck in visual measurement pipelines: extracting keypoints and descriptors that are both lightweight and geometrically robust. It offers an incremental improvement over existing learned extractors.
The paper tackles the lack of geometric invariance in deep learning-based keypoint and descriptor extraction by proposing a Sparse Deformable Descriptor Head (SDDH), which learns deformable positions for the supporting features of each keypoint and constructs descriptors only at sparse keypoints rather than over a dense map. The resulting network is efficient and performs well in image matching, 3D reconstruction, and visual relocalization.
Image keypoints and descriptors play a crucial role in many visual measurement tasks. In recent years, deep neural networks have been widely used to improve the performance of keypoint and descriptor extraction. However, the conventional convolution operations do not provide the geometric invariance required for the descriptor. To address this issue, we propose the Sparse Deformable Descriptor Head (SDDH), which learns the deformable positions of supporting features for each keypoint and constructs deformable descriptors. Furthermore, SDDH extracts descriptors at sparse keypoints instead of a dense descriptor map, which enables efficient extraction of descriptors with strong expressiveness. In addition, we relax the neural reprojection error (NRE) loss from dense to sparse to train the extracted sparse descriptors. Experimental results show that the proposed network is both efficient and powerful in various visual measurement tasks, including image matching, 3D reconstruction, and visual relocalization.
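The idea of building a descriptor from deformable sample positions at sparse keypoints can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the bilinear sampler, and the fixed aggregation weights are assumptions made for illustration. In particular, the abstract states that the sample positions are learned, whereas here the offsets are simply passed in as an array, and the aggregation is a plain weighted sum.

```python
import numpy as np


def bilinear_sample(feat, xy):
    """Bilinearly sample a feature map at float pixel coordinates.

    feat: (C, H, W) feature map; xy: (K, 2) array of (x, y) positions.
    Returns a (K, C) array. Positions are clamped to the image bounds.
    """
    _, H, W = feat.shape
    x = np.clip(xy[:, 0], 0, W - 1)
    y = np.clip(xy[:, 1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx  # (C, K)
    bot = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx  # (C, K)
    return (top * (1 - wy) + bot * wy).T


def sparse_deformable_descriptors(feat, keypoints, offsets, weights=None):
    """Hypothetical helper: descriptors at N sparse keypoints only.

    keypoints: (N, 2) positions; offsets: (N, M, 2) per-keypoint sample
    offsets (assumed to come from a learned predictor, not modelled here);
    weights: (M,) aggregation weights, uniform if omitted (an assumption).
    Returns (N, C) L2-normalised descriptors.
    """
    N, M, _ = offsets.shape
    positions = keypoints[:, None, :] + offsets            # (N, M, 2)
    samples = bilinear_sample(feat, positions.reshape(-1, 2))
    samples = samples.reshape(N, M, -1)                    # (N, M, C)
    if weights is None:
        weights = np.full(M, 1.0 / M)
    desc = np.einsum("nmc,m->nc", samples, weights)        # aggregate samples
    norm = np.linalg.norm(desc, axis=1, keepdims=True)
    return desc / np.maximum(norm, 1e-12)


# Usage: two keypoints, four sample positions each, on a random feature map.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
kpts = np.array([[3.0, 4.0], [10.5, 7.25]])
offs = rng.uniform(-2.0, 2.0, size=(2, 4, 2))
descriptors = sparse_deformable_descriptors(feat, kpts, offs)
```

Because only N keypoints with M samples each are touched, the cost in this sketch grows with N × M rather than with the image area, which is the efficiency argument the abstract makes for sparse over dense descriptor extraction.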