CV, LG | Sep 4, 2025

DUDE: Diffusion-Based Unsupervised Cross-Domain Image Retrieval

arXiv:2509.04193v1 | h-index: 13
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of retrieving images across diverse domains without annotations. The advance is incremental because it builds on existing feature disentanglement methods.

The paper tackles unsupervised cross-domain image retrieval by proposing DUDE, which uses a text-to-image generative model to disentangle object features from domain-specific styles and then aligns the disentangled features progressively. It reports state-of-the-art performance on three benchmark datasets spanning 13 domains.

Unsupervised cross-domain image retrieval (UCIR) aims to retrieve images of the same category across diverse domains without relying on annotations. Existing UCIR methods, which align cross-domain features for the entire image, often struggle with the domain gap, as the object features critical for retrieval are frequently entangled with domain-specific styles. To address this challenge, we propose DUDE, a novel UCIR method that builds upon feature disentanglement. In brief, DUDE leverages a text-to-image generative model to disentangle object features from domain-specific styles, thus facilitating semantic image retrieval. To further achieve reliable alignment of the disentangled object features, DUDE aligns mutual neighbors progressively, first within domains and then across domains. Extensive experiments demonstrate that DUDE achieves state-of-the-art performance on three benchmark datasets spanning 13 domains. The code will be released.
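
The abstract does not spell out how mutual neighbors are found, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes features are already extracted as NumPy arrays and uses cosine similarity. The function name `mutual_neighbors` and the parameters `k` and `exclude_self` are hypothetical, introduced here for illustration.

```python
import numpy as np

def mutual_neighbors(query, gallery, k=1, exclude_self=False):
    """Return pairs (i, j) where gallery[j] is in the top-k of query[i]
    and query[i] is in the top-k of gallery[j] (cosine similarity).

    Hypothetical helper for illustration; not taken from the DUDE paper.
    """
    # L2-normalize rows so the dot product equals cosine similarity.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sim = q @ g.T  # shape: (n_query, n_gallery)

    if exclude_self:
        # Only meaningful when query and gallery are the same set:
        # stop each item from matching itself.
        np.fill_diagonal(sim, -np.inf)

    # Top-k gallery indices for each query row.
    q_to_g = np.argsort(-sim, axis=1)[:, :k]
    # Top-k query indices for each gallery column, shape (n_gallery, k).
    g_to_q = np.argsort(-sim, axis=0)[:k, :].T

    pairs = []
    for i in range(sim.shape[0]):
        for j in q_to_g[i]:
            if i in g_to_q[j]:  # the neighbor relation holds both ways
                pairs.append((i, int(j)))
    return pairs

# Toy 2-D features standing in for disentangled object features.
domain_a = np.array([[1.0, 0.0], [0.0, 1.0]])
domain_b = np.array([[0.9, 0.1], [0.1, 0.9], [0.7, 0.7]])

# Within-domain pairs first (self-matches excluded), then cross-domain pairs.
within_b = mutual_neighbors(domain_b, domain_b, k=1, exclude_self=True)
across_ab = mutual_neighbors(domain_a, domain_b, k=1)
```

Requiring the neighbor relation to hold in both directions filters out one-sided matches, which is a common way to obtain more reliable pseudo-pairs when no labels are available. A progressive schedule in the spirit of the abstract would train on within-domain pairs before adding cross-domain pairs.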

