LG · CV · Jun 13, 2024

Enhancing Domain Adaptation through Prompt Gradient Alignment

arXiv:2406.09353v3 · 13 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses domain adaptation challenges in computer vision, offering an incremental improvement by refining gradient alignment techniques for better feature learning.

The paper frames Unsupervised Domain Adaptation (UDA) as a multi-objective optimization problem with one loss per domain, and aligns the per-objective gradients so that the objectives reach consensus. The authors report that the method consistently outperforms other vision-language model adaptation methods.

Prior Unsupervised Domain Adaptation (UDA) methods often aim to train a domain-invariant feature extractor, which may hinder the model from learning sufficiently discriminative features. To tackle this, a line of works based on prompt learning leverages the power of large-scale pre-trained vision-language models to learn both domain-invariant and specific features through a set of domain-agnostic and domain-specific learnable prompts. Those studies typically enforce invariant constraints on representation, output, or prompt space to learn such prompts. In contrast, we cast UDA as a multiple-objective optimization problem in which each objective is represented by a domain loss. Under this new framework, we propose to align per-objective gradients to foster consensus between them. Additionally, to prevent potential overfitting when fine-tuning this deep learning architecture, we penalize the norm of these gradients. To achieve these goals, we devise a practical gradient update procedure that can work under both single-source and multi-source UDA. Empirically, our method consistently outperforms other vision-language model adaptation methods. The implementation is available at https://github.com/VietHoang1512/PGA.
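The two ideas in the abstract, aligning per-objective gradients and penalizing their norms, can be illustrated on a toy problem. The sketch below is not the authors' update procedure (see the linked repository for that). It assumes two hypothetical diagonal quadratic losses standing in for the source and target domain losses, and it assumes a combined objective of the form L_src + L_tgt - lam_align * <g_src, g_tgt> + 0.5 * lam_norm * (||g_src||^2 + ||g_tgt||^2). All names, constants, and weights are illustrative.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cosine(u, v):
    # Cosine similarity: +1 means the two gradients fully agree in direction.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Hypothetical toy objectives: diagonal quadratics with different curvatures
# and different minimizers, standing in for the source/target domain losses.
SRC_CURV, SRC_OPT = [1.0, 2.0], [1.0, 0.0]
TGT_CURV, TGT_OPT = [2.0, 1.0], [0.0, 1.0]

def loss(theta, curv, opt):
    return 0.5 * sum(c * (t - o) ** 2 for c, t, o in zip(curv, theta, opt))

def grad(theta, curv, opt):
    return [c * (t - o) for c, t, o in zip(curv, theta, opt)]

def objective(theta, lam_align=0.05, lam_norm=0.05):
    # Assumed combined objective: sum of losses, minus a reward for gradient
    # agreement (inner product), plus a penalty on squared gradient norms.
    g_s = grad(theta, SRC_CURV, SRC_OPT)
    g_t = grad(theta, TGT_CURV, TGT_OPT)
    return (loss(theta, SRC_CURV, SRC_OPT) + loss(theta, TGT_CURV, TGT_OPT)
            - lam_align * dot(g_s, g_t)
            + 0.5 * lam_norm * (dot(g_s, g_s) + dot(g_t, g_t)))

def objective_grad(theta, lam_align=0.05, lam_norm=0.05):
    # Exact gradient of `objective`; the Hessians are diagonal (the curvatures),
    # so Hessian-vector products reduce to elementwise multiplication.
    g_s = grad(theta, SRC_CURV, SRC_OPT)
    g_t = grad(theta, TGT_CURV, TGT_OPT)
    out = []
    for c, d, gs, gt in zip(SRC_CURV, TGT_CURV, g_s, g_t):
        align_term = c * gt + d * gs  # d/dtheta of <g_s, g_t>
        norm_term = c * gs + d * gt   # d/dtheta of 0.5 * (|g_s|^2 + |g_t|^2)
        out.append(gs + gt - lam_align * align_term + lam_norm * norm_term)
    return out

def step(theta, lr=0.1):
    # One plain gradient-descent step on the combined objective.
    return [t - lr * g for t, g in zip(theta, objective_grad(theta))]

theta0 = [2.0, -1.0]
theta1 = step(theta0)
```

Calling `cosine(grad(theta, SRC_CURV, SRC_OPT), grad(theta, TGT_CURV, TGT_OPT))` before and after a few steps shows how much the two per-objective gradients agree. In a real prompt-learning setup the Hessian-vector products would not be available in closed form, which is why a practical update procedure such as the one the paper describes is needed.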

Code Implementations: 1 repo
