CV | Sep 18, 2025

CoDoL: Conditional Domain Prompt Learning for Out-of-Distribution Generalization

arXiv:2509.15330v1 | h-index: 1
Originality: Incremental advance
AI Analysis

The paper addresses the degraded accuracy and robustness of zero-shot CLIP methods, a problem relevant to researchers and practitioners in computer vision. It represents an incremental improvement.

The paper tackles two weaknesses of prompt-based CLIP methods for out-of-distribution generalization: inaccurate text descriptions and limited vision-language alignment. It proposes CoDoL, which builds prompts from domain information and uses a Domain Meta Network to improve alignment, and it validates the method on four OOD benchmarks.

Recent advances in pre-training vision-language models (VLMs), e.g., contrastive language-image pre-training (CLIP) methods, have shown great potential in learning out-of-distribution (OOD) representations. Despite showing competitive performance, prompt-based CLIP methods still suffer from: i) inaccurate text descriptions, which lead to degraded accuracy and robustness and pose a challenge for zero-shot CLIP methods; and ii) limited vision-language embedding alignment, which significantly affects generalization performance. To tackle these issues, this paper proposes a novel Conditional Domain prompt Learning (CoDoL) method, which uses readily available domain information to form prompts and improves the vision-language embedding alignment, and thereby OOD generalization. To capture both instance-specific and domain-specific information, we further propose a lightweight Domain Meta Network (DMN) that generates input-conditional tokens for images in each domain. Extensive experiments on four OOD benchmarks (PACS, VLCS, OfficeHome and DigitDG) validate the effectiveness of the proposed CoDoL in improving both the vision-language embedding alignment and the out-of-distribution generalization performance.
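The two ideas in the abstract, domain-aware prompts and input-conditional tokens, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the prompt template or the DMN architecture. The template `a {domain} of a {class}.`, the class `TinyMetaNet`, the helper `condition_context`, and all dimensions are hypothetical names and values chosen for illustration, and the sketch assumes the conditional token is a single shift vector added to each context token. Only the PACS domain and class names come from the benchmark itself.

```python
import random

# PACS domain and class names (standard labels of the benchmark).
DOMAINS = ["photo", "art painting", "cartoon", "sketch"]
CLASSES = ["dog", "elephant", "giraffe", "guitar", "horse", "house", "person"]


def domain_prompt(domain: str, class_name: str) -> str:
    """Hypothetical template: state the image's domain in the text prompt."""
    return f"a {domain} of a {class_name}."


class TinyMetaNet:
    """Hypothetical stand-in for a Domain Meta Network.

    A two-layer bottleneck MLP (linear, ReLU, linear) that maps one image
    feature vector to one shift vector of the context-token dimension.
    Weights are random here; in practice they would be learned.
    """

    def __init__(self, feat_dim: int, hidden_dim: int, ctx_dim: int, seed: int = 0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0, 0.1) for _ in range(feat_dim)]
                   for _ in range(hidden_dim)]
        self.w2 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)]
                   for _ in range(ctx_dim)]

    def __call__(self, feature):
        # Linear layer followed by ReLU.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, feature)))
                  for row in self.w1]
        # Second linear layer produces the per-image shift vector.
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w2]


def condition_context(domain_ctx, shift):
    """Add the per-image shift to every context token of the image's domain."""
    return [[c + s for c, s in zip(token, shift)] for token in domain_ctx]


# One text prompt per (domain, class) pair.
prompts = {(d, c): domain_prompt(d, c) for d in DOMAINS for c in CLASSES}

# Toy sizes (assumed): 8-dim image feature, 4 context tokens of 6 dims each.
rng = random.Random(1)
sketch_ctx = [[rng.gauss(0, 0.02) for _ in range(6)] for _ in range(4)]
image_feature = [rng.gauss(0, 1) for _ in range(8)]

meta_net = TinyMetaNet(feat_dim=8, hidden_dim=3, ctx_dim=6)
shift = meta_net(image_feature)
conditioned_ctx = condition_context(sketch_ctx, shift)
```

Under these assumptions, `sketch_ctx` carries the domain-specific part and `shift` the instance-specific part, so each image in the sketch domain sees its own slightly different set of context tokens. A real system would learn both with gradient descent on top of a frozen CLIP encoder.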

