CV
Nov 28, 2023

Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence

arXiv:2311.17034v2
98 citations
h-index: 33
Originality: Incremental advance
AI Analysis

The paper addresses the limited grasp of geometry that current foundation models show in semantic correspondence, and it improves results in both zero-shot and supervised settings. The advance is incremental, since it builds on existing models with simple post-processing.

The paper tackles the problem of geometry-aware semantic correspondence in vision models, showing that incorporating geometry and orientation information significantly improves performance, achieving PCK@0.10 scores of 65.4 (zero-shot) and 85.6 (supervised) on SPair-71k with absolute gains of 5.5 and 11.0 points over state-of-the-art.

While pre-trained large-scale vision models have shown significant promise for semantic correspondence, their features often struggle to grasp the geometry and orientation of instances. This paper identifies the importance of being geometry-aware for semantic correspondence and reveals a limitation of the features of current foundation models under simple post-processing. We show that incorporating this information can markedly enhance semantic correspondence performance with simple but effective solutions in both zero-shot and supervised settings. We also construct a new challenging benchmark for semantic correspondence built from an existing animal pose estimation dataset, for both pre-training and validating models. Our method achieves a PCK@0.10 score of 65.4 (zero-shot) and 85.6 (supervised) on the challenging SPair-71k dataset, outperforming the state of the art by 5.5p and 11.0p absolute gains, respectively. Our code and datasets are publicly available at: https://telling-left-from-right.github.io/.
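For readers unfamiliar with the metric quoted above, PCK@0.10 (percentage of correct keypoints) is commonly computed as the share of predicted keypoints that land within 0.10 times the larger side of the object bounding box from their ground-truth locations. The sketch below illustrates that common definition only. The function name `pck`, its parameters, and the toy coordinates are hypothetical, and the abstract does not say how the paper aggregates scores across images or categories.

```python
import math


def pck(pred, gt, bbox_hw, alpha=0.10):
    """Fraction of keypoints predicted within alpha * max(h, w) of ground truth.

    Assumption: the threshold is scaled by the larger bounding-box side,
    which is a common convention but is not stated in the abstract.
    """
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    threshold = alpha * max(bbox_hw)
    hits = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return hits / len(gt)


# Toy data (made up for illustration): three keypoints, bounding box 100 x 80.
gt = [(10, 10), (50, 40), (80, 90)]
pred = [(12, 11), (50, 55), (79, 88)]
score = pck(pred, gt, bbox_hw=(100, 80))  # threshold is 10 pixels here
```

With these toy points, the first and third predictions fall inside the 10-pixel threshold and the second misses by 15 pixels, so two of three keypoints count as correct. Reported scores such as 65.4 are this fraction expressed as a percentage.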

Code Implementations: 1 repo