CV · Sep 5, 2023

Doppelgangers: Learning to Disambiguate Images of Similar Structures

Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, Noah Snavely

DeepMind

arXiv:2309.02420v1 · 16.434 citations · h-index: 73 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a known failure mode in 3D reconstruction, where illusory matches between visually similar but distinct surfaces cause errors. It is an incremental advance: it builds on existing SfM methods and contributes a new dataset and network.

The paper tackles the problem of distinguishing whether two visually similar images show the same or different 3D surfaces, such as symmetric buildings, and proposes a learning-based binary classification method that integrates into SfM pipelines to produce correct 3D reconstructions.

We consider the visual disambiguation task of determining whether a pair of visually similar images depict the same or distinct 3D surfaces (e.g., the same or opposite sides of a symmetric building). Illusory image matches, where two images observe distinct but visually similar 3D surfaces, can be challenging for humans to differentiate, and can also lead 3D reconstruction algorithms to produce erroneous results. We propose a learning-based approach to visual disambiguation, formulating it as a binary classification task on image pairs. To that end, we introduce a new dataset for this problem, Doppelgangers, which includes image pairs of similar structures with ground truth labels. We also design a network architecture that takes the spatial distribution of local keypoints and matches as input, allowing for better reasoning about both local and global cues. Our evaluation shows that our method can distinguish illusory matches in difficult cases, and can be integrated into SfM pipelines to produce correct, disambiguated 3D reconstructions. See our project page for our code, datasets, and more results: http://doppelgangers-3d.github.io/.
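The abstract describes a classifier that reads the spatial distribution of keypoints and matches, and whose output can be used inside an SfM pipeline. The following is a minimal sketch of those two ideas, not the authors' implementation: the coarse grid masks, the function names (`rasterize_mask`, `pair_input`, `filter_pairs`), the grid size, and the 0.5 threshold are all assumptions made for illustration, and the classifier itself is left out.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]   # (x, y) pixel coordinates
Match = Tuple[int, int]       # (index into keypoints_a, index into keypoints_b)


def rasterize_mask(points: List[Point], width: int, height: int,
                   grid: int = 8) -> List[List[int]]:
    """Mark every cell of a grid x grid mask that contains at least one point."""
    mask = [[0] * grid for _ in range(grid)]
    for x, y in points:
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore points outside the image bounds
        col = int(x * grid / width)
        row = int(y * grid / height)
        mask[row][col] = 1
    return mask


def pair_input(keypoints_a: List[Point], keypoints_b: List[Point],
               matches: List[Match], size_a: Tuple[int, int],
               size_b: Tuple[int, int], grid: int = 8) -> Dict[str, List[List[int]]]:
    """Build keypoint masks and match masks for both images of a pair.

    Hypothetical input layout: the abstract only says the network takes the
    spatial distribution of keypoints and matches as input.
    """
    matched_a = [keypoints_a[i] for i, _ in matches]
    matched_b = [keypoints_b[j] for _, j in matches]
    return {
        "keypoints_a": rasterize_mask(keypoints_a, size_a[0], size_a[1], grid),
        "keypoints_b": rasterize_mask(keypoints_b, size_b[0], size_b[1], grid),
        "matches_a": rasterize_mask(matched_a, size_a[0], size_a[1], grid),
        "matches_b": rasterize_mask(matched_b, size_b[0], size_b[1], grid),
    }


def filter_pairs(pair_scores: Dict[Tuple[str, str], float],
                 threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Keep image pairs whose predicted same-surface probability meets the threshold.

    The scores would come from a binary classifier; the threshold is an
    assumed value, not one taken from the paper.
    """
    return [pair for pair, prob in pair_scores.items() if prob >= threshold]
```

In this sketch, pairs rejected by `filter_pairs` would simply be left out of the set of image pairs handed to an SfM pipeline, so that illusory matches do not merge distinct surfaces into one model. How the actual method encodes its inputs and prunes pairs is documented on the project page rather than in the abstract.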
