CV | May 14, 2024

TP3M: Transformer-based Pseudo 3D Image Matching with Reference Image

arXiv:2405.08434v2 | 2 citations | h-index: 5 | ICRA
Originality: Incremental advance
AI Analysis

This paper addresses image matching in computer vision for applications such as robotics and AR, but it appears incremental because it builds on existing Transformer and 3D matching ideas.

The paper tackles the challenge of image matching in scenes with large viewpoint or illumination changes by proposing a Transformer-based pseudo 3D method that uses a reference image to upgrade 2D features to 3D, achieving state-of-the-art results on homography estimation, pose estimation, and visual localization tasks.

Image matching remains challenging in scenes with large viewpoint or illumination changes or with low texture. In this paper, we propose a Transformer-based pseudo 3D image matching method. It upgrades the 2D features extracted from the source image to 3D features with the help of a reference image, then matches them to the 2D features extracted from the destination image through coarse-to-fine 3D matching. Our key discovery is that introducing the reference image screens the source image's fine points and further enriches their feature descriptors from 2D to 3D, which improves matching performance with the destination image. Experimental results on multiple datasets show that the proposed method achieves state-of-the-art results on homography estimation, pose estimation, and visual localization, especially in challenging scenes.
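To make the data flow in the abstract concrete, here is a toy NumPy sketch of the three stages it names: screening source points against a reference image, enriching the surviving descriptors, and matching them to the destination image. It is not the paper's method. The Transformer, the actual 2D-to-3D upgrade, and the coarse-to-fine scheme are all replaced by simple stand-ins (mutual nearest-neighbor matching and descriptor averaging), and every function name here is hypothetical.

```python
import numpy as np


def mutual_nearest_neighbors(desc_a, desc_b):
    """Return (i, j) index pairs where desc_a[i] and desc_b[j] pick each other.

    Rows are assumed to be unit-norm, so the dot product is cosine similarity.
    """
    if desc_a.shape[0] == 0 or desc_b.shape[0] == 0:
        return np.empty((0, 2), dtype=int)
    sim = desc_a @ desc_b.T
    best_b_for_a = sim.argmax(axis=1)
    best_a_for_b = sim.argmax(axis=0)
    idx_a = np.arange(desc_a.shape[0])
    keep = best_a_for_b[best_b_for_a] == idx_a
    return np.stack([idx_a[keep], best_b_for_a[keep]], axis=1)


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def match_with_reference(src, ref, dst):
    """Toy pipeline: screen src points via ref, enrich them, match to dst.

    Assumption: averaging the source and reference descriptors is only a
    placeholder for the learned 2D-to-3D feature upgrade in the paper.
    """
    # 1) Screening: keep source points with a mutual match in the reference.
    src_ref = mutual_nearest_neighbors(src, ref)
    kept_src = src_ref[:, 0]
    # 2) Enrichment (stand-in): fuse each kept source descriptor with its
    #    matched reference descriptor.
    fused = l2_normalize(src[kept_src] + ref[src_ref[:, 1]])
    # 3) Matching: mutual nearest neighbors between fused and destination.
    pairs = mutual_nearest_neighbors(fused, dst)
    # Map back to original source indices.
    return np.stack([kept_src[pairs[:, 0]], pairs[:, 1]], axis=1)


# Tiny synthetic example: one-hot descriptors in shuffled order.
basis = np.eye(4)
src = basis[[0, 1, 2]]
ref = basis[[2, 0, 1]]
dst = basis[[1, 2, 0, 3]]
matches = match_with_reference(src, ref, dst)
```

In this synthetic case each row of `matches` pairs a source index with the destination index holding the same one-hot descriptor. With real descriptors, the screening step would discard source points that have no consistent counterpart in the reference image.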

