Similarity-Aware Selective State-Space Modeling for Semantic Correspondence
The work addresses a fundamental computer vision problem that underlies tasks such as image matching, but the contribution appears incremental: it builds on existing correlation-metric approaches and mainly improves their efficiency.
The paper tackles the problem of establishing semantic correspondences between images by introducing MambaMatcher, which uses selective state-space models to model high-dimensional correlations efficiently and reports state-of-the-art performance on standard benchmarks.
Establishing semantic correspondences between images is a fundamental yet challenging task in computer vision. Traditional feature-metric methods enhance visual features but may miss complex inter-correlation relationships, while recent correlation-metric approaches are hindered by high computational costs due to processing 4D correlation maps. We introduce MambaMatcher, a novel method that overcomes these limitations by efficiently modeling high-dimensional correlations using selective state-space models (SSMs). By implementing a similarity-aware selective scan mechanism adapted from Mamba's linear-complexity algorithm, MambaMatcher refines the 4D correlation map effectively without compromising feature map resolution or receptive field. Experiments on standard semantic correspondence benchmarks demonstrate that MambaMatcher achieves state-of-the-art performance.
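To make the two central objects in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It builds a 4D correlation map from two feature maps using cosine similarity, then runs a toy linear-time recurrence over the flattened map in which the retention gate depends on the similarity value itself. The function names (correlation_4d, similarity_gated_scan), the sigmoid gate, and the decay_scale parameter are all assumptions chosen for illustration; the actual similarity-aware selective scan in MambaMatcher may differ substantially.

```python
import numpy as np


def correlation_4d(feat_src, feat_tgt, eps=1e-8):
    """Cosine-similarity correlation map.

    feat_src: (Hs, Ws, C) source features; feat_tgt: (Ht, Wt, C) target features.
    Returns an array of shape (Hs, Ws, Ht, Wt).
    """
    src = feat_src / (np.linalg.norm(feat_src, axis=-1, keepdims=True) + eps)
    tgt = feat_tgt / (np.linalg.norm(feat_tgt, axis=-1, keepdims=True) + eps)
    # Dot product between every source position and every target position.
    return np.einsum("ijc,klc->ijkl", src, tgt)


def similarity_gated_scan(corr, decay_scale=1.0):
    """Toy input-dependent recurrence over the flattened correlation map.

    h_t = a_t * h_{t-1} + (1 - a_t) * x_t, with a_t = sigmoid(-decay_scale * x_t).
    The gate is an illustrative assumption: a high similarity x_t lowers a_t,
    so the state leans on the current entry rather than on the history.
    Cost is linear in the number of correlation entries.
    """
    x = corr.reshape(-1)
    a = 1.0 / (1.0 + np.exp(decay_scale * x))  # sigmoid(-decay_scale * x)
    out = np.empty_like(x)
    h = 0.0
    for t in range(x.size):
        h = a[t] * h + (1.0 - a[t]) * x[t]
        out[t] = h
    return out.reshape(corr.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat_a = rng.standard_normal((3, 4, 8))
    feat_b = rng.standard_normal((3, 4, 8))
    corr = correlation_4d(feat_a, feat_b)
    refined = similarity_gated_scan(corr)
    print(corr.shape, refined.shape)
```

The sketch scans in a single raster order and keeps a scalar state purely for readability. The point it illustrates is only that a recurrence of this form touches each of the Hs*Ws*Ht*Wt entries once, in contrast to operators whose cost grows faster with the size of the 4D map.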