CV · Feb 28, 2024

Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport

arXiv:2402.18411v4 · 16 citations · h-index: 16 · AAAI
Originality: Highly original
AI Analysis

The paper addresses the problem of retrieving images across domains without labeled data, a computer vision task, and its contribution is a novel integration rather than an incremental improvement.

The paper tackles unsupervised cross-domain image retrieval by introducing ProtoOT, a unified Optimal Transport framework that integrates intra-domain representation learning and cross-domain alignment, achieving state-of-the-art results with an 18.17% average P@200 improvement on DomainNet and 3.83% P@15 improvement on Office-Home.
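The P@200 and P@15 figures above are precision-at-k scores. As a minimal sketch of how such a score is commonly computed for one query (the function name and the choice to divide by k even when fewer than k items are returned are assumptions for illustration, not details taken from the paper):

```python
def precision_at_k(query_label, ranked_labels, k):
    """Fraction of the top-k retrieved items that share the query's category.

    ranked_labels: category labels of the retrieved images, best match first.
    Assumption: we divide by k even if fewer than k items were retrieved.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_labels[:k]
    hits = sum(1 for label in top_k if label == query_label)
    return hits / k


# A query of category "dog" whose top-5 retrieved images contain three dogs.
score = precision_at_k("dog", ["dog", "cat", "dog", "dog", "bird"], k=5)
```

Averaging this score over all queries would give a dataset-level P@k of the kind reported above.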

Unsupervised cross-domain image retrieval (UCIR) aims to retrieve images sharing the same category across diverse domains without relying on labeled data. Prior approaches have typically decomposed the UCIR problem into two distinct tasks: intra-domain representation learning and cross-domain feature alignment. However, these segregated strategies overlook the potential synergies between these tasks. This paper introduces ProtoOT, a novel Optimal Transport formulation explicitly tailored for UCIR, which integrates intra-domain feature representation learning and cross-domain alignment into a unified framework. ProtoOT leverages the strengths of the K-means clustering method to effectively manage distribution imbalances inherent in UCIR. By utilizing K-means for generating initial prototypes and approximating class marginal distributions, we modify the constraints in Optimal Transport accordingly, significantly enhancing its performance in UCIR scenarios. Furthermore, we incorporate contrastive learning into the ProtoOT framework to further improve representation learning. This encourages local semantic consistency among features with similar semantics, while also explicitly enforcing separation between features and unmatched prototypes, thereby enhancing global discriminativeness. ProtoOT surpasses existing state-of-the-art methods by a notable margin across benchmark datasets. Notably, on DomainNet, ProtoOT achieves an average P@200 enhancement of 18.17%, and on Office-Home, it demonstrates a P@15 improvement of 3.83%.
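The abstract's idea of using K-means both to initialize prototypes and to approximate class marginals for Optimal Transport can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the spherical K-means variant, the add-one smoothing of cluster sizes, the entropic regularization value, and the use of plain Sinkhorn iterations are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def kmeans_prototypes(features, k, iters=20, seed=0):
    """Plain spherical K-means; returns (prototypes, hard cluster labels)."""
    rng = np.random.default_rng(seed)
    protos = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(features @ protos.T, axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old prototype if a cluster empties
                protos[j] = members.mean(axis=0)
        protos = l2_normalize(protos)
    labels = np.argmax(features @ protos.T, axis=1)
    return protos, labels


def sinkhorn_plan(scores, row_marginal, col_marginal, eps=0.05, iters=200):
    """Entropic OT via Sinkhorn scaling; higher score means cheaper transport."""
    kernel = np.exp((scores - scores.max()) / eps)  # shift for numerical safety
    v = np.ones_like(col_marginal)
    for _ in range(iters):
        u = row_marginal / (kernel @ v)
        v = col_marginal / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


def assign_with_kmeans_marginals(features, k):
    """Feature-to-prototype plan whose column marginal follows K-means sizes."""
    features = l2_normalize(features)
    protos, labels = kmeans_prototypes(features, k)
    n = len(features)
    counts = np.bincount(labels, minlength=k)
    # Assumption: smoothed cluster proportions stand in for class marginals,
    # replacing the uniform column constraint of balanced OT.
    col_marginal = (counts + 1.0) / (n + k)
    row_marginal = np.full(n, 1.0 / n)
    plan = sinkhorn_plan(features @ protos.T, row_marginal, col_marginal)
    return plan, protos, col_marginal
```

Under these assumptions, each row of the returned plan, multiplied by the number of samples, can be read as a soft assignment of one feature to the prototypes. Because the column constraint tracks the estimated cluster sizes, large and small categories are not forced to receive equal mass.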

