Joint Learning of Feature Extraction and Cost Aggregation for Semantic Correspondence
This work addresses the challenge of semantic correspondence, which is relevant to computer vision applications, but its contribution is incremental: it builds on existing methods by integrating them more effectively.
The paper tackles the problem of establishing dense correspondences across semantically similar images by proposing a joint learning framework for feature extraction and cost aggregation, achieving competitive results on standard benchmarks.
Establishing dense correspondences across semantically similar images is a challenging task due to significant intra-class variations and background clutter. To address these problems, numerous methods have been proposed that focus on learning the feature extractor or the cost aggregation independently, which yields sub-optimal performance. In this paper, we propose a novel framework for jointly learning feature extraction and cost aggregation for semantic correspondence. By exploiting the pseudo labels from each module, the networks consisting of feature extraction and cost aggregation modules are simultaneously learned in a boosting fashion. Moreover, to ignore unreliable pseudo labels, we present a confidence-aware contrastive loss function for learning the networks in a weakly-supervised manner. We demonstrate competitive results on standard benchmarks for semantic correspondence.
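The abstract does not give the form of the loss, so the following is only a minimal NumPy sketch of how pseudo labels from one module might supervise the other while unreliable labels are ignored. Everything in it is an assumption made for illustration rather than the authors' formulation: the InfoNCE-style loss, the softmax-based confidence, the hard threshold, the temperature value, and all function names. In particular, `aggregate` is a hand-crafted placeholder (soft mutual-nearest-neighbour filtering) standing in for a learned cost aggregation network.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def correlation(src_feat, tgt_feat):
    # Cosine-similarity cost volume between (N, C) source and (M, C) target features.
    return l2_normalize(src_feat) @ l2_normalize(tgt_feat).T


def log_softmax(x, axis=1):
    # Numerically stable log-softmax.
    z = x - x.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))


def pseudo_labels(cost, temperature=0.05):
    # Hard match per source position, plus its softmax probability as confidence.
    prob = np.exp(log_softmax(cost / temperature, axis=1))
    return prob.argmax(axis=1), prob.max(axis=1)


def confidence_aware_contrastive_loss(cost, labels, confidence,
                                      threshold=0.5, temperature=0.05):
    # Assumed form: per-row negative log-likelihood of the pseudo-labelled match,
    # averaged only over rows whose confidence exceeds the threshold.
    log_prob = log_softmax(cost / temperature, axis=1)
    nll = -log_prob[np.arange(cost.shape[0]), labels]
    keep = (confidence > threshold).astype(cost.dtype)
    return float((keep * nll).sum() / max(keep.sum(), 1.0))


def aggregate(cost, temperature=0.05):
    # Placeholder for a learned aggregation module (hypothetical stand-in):
    # down-weights matches that are not mutually consistent.
    row = np.exp(log_softmax(cost / temperature, axis=1))
    col = np.exp(log_softmax(cost / temperature, axis=0))
    return cost * row * col


def joint_losses(src_feat, tgt_feat):
    # Each module is supervised by pseudo labels taken from the other one.
    raw = correlation(src_feat, tgt_feat)
    agg = aggregate(raw)
    labels_agg, conf_agg = pseudo_labels(agg)
    labels_raw, conf_raw = pseudo_labels(raw)
    loss_feat = confidence_aware_contrastive_loss(raw, labels_agg, conf_agg)
    loss_agg = confidence_aware_contrastive_loss(agg, labels_raw, conf_raw)
    return loss_feat, loss_agg
```

In an actual training setup the two losses would be backpropagated through learned networks in an autodiff framework; the sketch only shows the data flow between the cost volume, the pseudo labels, and the confidence mask.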