CV AI MM

Jul 11, 2025

PanMatch: Unleashing the Potential of Large Vision Models for Unified Matching Models

Yongjian Zhang, Longguang Wang, Kunhong Li, Ye Zhang, Yun Wang, Liang Lin, Yulan Guo

arXiv:2507.08400v1 3.6 h-index: 34

Originality: Incremental advance

AI Analysis

This work addresses the need for versatile and robust matching models across multiple domains like stereo matching and optical flow, offering a unified solution that reduces the need for specialized designs, though it builds incrementally on existing displacement estimation methods.

The authors tackled the problem of creating a unified model for various two-frame correspondence matching tasks by proposing PanMatch, which uses a 2D displacement estimation framework with shared weights to eliminate task-specific architectures. The result is a model that outperforms UniMatch and Flow-Anything in cross-task evaluations, achieves comparable performance to state-of-the-art task-specific algorithms, and shows strong zero-shot capabilities in abnormal scenarios like rainy days and satellite imagery.

This work presents PanMatch, a versatile foundation model for robust correspondence matching. Unlike previous methods that rely on task-specific architectures and domain-specific fine-tuning to support tasks like stereo matching, optical flow, or feature matching, our key insight is that any two-frame correspondence matching task can be addressed within a 2D displacement estimation framework using the same model weights. Such a formulation eliminates the need for designing specialized unified architectures or task-specific ensemble models. Instead, it achieves multi-task integration by endowing displacement estimation algorithms with unprecedented generalization capabilities. To this end, we highlight the importance of a robust feature extractor applicable across multiple domains and tasks, and propose a feature transformation pipeline that leverages all-purpose features from Large Vision Models to endow matching baselines with zero-shot cross-view matching capabilities. Furthermore, we assemble a cross-domain dataset of nearly 1.8 million samples from the stereo matching, optical flow, and feature matching domains to pretrain PanMatch. We demonstrate the versatility of PanMatch across a wide range of domains and downstream tasks using the same model weights. Our model outperforms UniMatch and Flow-Anything on cross-task evaluations, and achieves performance comparable to most state-of-the-art task-specific algorithms on task-oriented benchmarks. Additionally, PanMatch delivers unprecedented zero-shot performance in abnormal scenarios, such as rainy days and satellite imagery, where most existing robust algorithms fail to yield meaningful results.
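To make the 2D displacement formulation concrete, here is a minimal sketch of how a stereo disparity map and a set of point matches can both be expressed through a single per-pixel displacement field. It is not taken from the paper: the helper names `disparity_to_displacement` and `displacement_to_matches` are hypothetical, and the sign convention assumes a rectified pair with the left image as reference, so that left pixel (x, y) matches right pixel (x - d, y).

```python
from typing import List, Tuple

# A displacement field stores one (dx, dy) vector per pixel: field[y][x].
Field = List[List[Tuple[float, float]]]


def disparity_to_displacement(disparity: List[List[float]]) -> Field:
    """Express rectified-stereo disparity as a 2D displacement field.

    Assumption: left image is the reference, so a disparity d moves the
    match d pixels to the left and leaves the row unchanged.
    """
    return [[(-d, 0.0) for d in row] for row in disparity]


def displacement_to_matches(
    field: Field,
) -> List[Tuple[Tuple[int, int], Tuple[float, float]]]:
    """Turn a displacement field into explicit (source, target) point pairs."""
    matches = []
    for y, row in enumerate(field):
        for x, (dx, dy) in enumerate(row):
            matches.append(((x, y), (x + dx, y + dy)))
    return matches


# Toy 2x2 disparity map (values are illustrative only).
disparity = [[1.0, 2.0], [0.5, 0.0]]
field = disparity_to_displacement(disparity)
matches = displacement_to_matches(field)
```

Optical flow already has this form, with a generally nonzero vertical component, which is why one displacement estimator can in principle serve all three tasks.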

