CV · Apr 17, 2023

Progressive Visual Prompt Learning with Contrastive Feature Re-formation

arXiv:2304.08386v3 · 61 citations · h-index: 65
Originality: Incremental advance
AI Analysis

This work addresses the challenge of visual prompt learning for vision-language models, offering a novel method that improves adaptation and generalization, though it is incremental in advancing prompt-based techniques.

The paper tackles the problem of adapting vision-language models to downstream tasks using visual prompts, which previously suffered from mediocre performance or unstable training, and achieves state-of-the-art results on 7 out of 11 image benchmark datasets in few-shot and base-to-novel settings.

Prompt learning has been designed as an alternative to fine-tuning for adapting vision-language (V-L) models to downstream tasks. Previous works mainly focus on text prompts, while work on visual prompts for V-L models is limited. Existing visual prompt methods suffer from either mediocre performance or an unstable training process, indicating the difficulty of visual prompt learning. In this paper, we propose a new Progressive Visual Prompt (ProVP) structure to strengthen the interactions among prompts of different layers. More importantly, our ProVP can effectively propagate the image embeddings to deep layers and behaves partially like an instance-adaptive prompt method. To alleviate generalization deterioration, we further propose a new contrastive feature re-formation, which prevents serious deviation of the prompted visual feature from the fixed CLIP visual feature distribution. Combining both, our method (ProVP-Ref) is evaluated on 11 image benchmark datasets and achieves state-of-the-art results on 7 of the 11 in both few-shot and base-to-novel settings. To the best of our knowledge, we are the first to demonstrate that visual prompts in V-L models can outperform previous prompt-based methods in downstream tasks. This also implies that ProVP-Ref shows the best capability to adapt and to generalize.
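The two ideas in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the residual-style prompt update, the InfoNCE-style loss, and every name here (`progressive_prompts`, `block_fn`, `alpha`, `contrastive_reformation_loss`, `temperature`) are assumptions chosen only to make the concepts concrete. Each prompt is reduced to a single vector, and `block_fn` is a hypothetical stand-in for a transformer block.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def progressive_prompts(learned_prompts, block_fn, alpha=0.1):
    """Assumed progressive scheme: the prompt fed to layer i is that layer's
    learned prompt plus alpha times the previous layer's output at the
    prompt position, so earlier prompts influence deeper ones."""
    inputs = [list(learned_prompts[0])]
    for i in range(1, len(learned_prompts)):
        prev_out = block_fn(inputs[-1])  # hypothetical transformer block
        inputs.append(
            [p + alpha * o for p, o in zip(learned_prompts[i], prev_out)]
        )
    return inputs


def contrastive_reformation_loss(prompted, frozen, temperature=0.07):
    """Assumed InfoNCE-style loss: each prompted feature should match the
    frozen CLIP feature of the same image (positive) rather than the
    frozen features of other images in the batch (negatives)."""
    p = [l2_normalize(v) for v in prompted]
    f = [l2_normalize(v) for v in frozen]
    total = 0.0
    for i, p_i in enumerate(p):
        logits = [dot(p_i, f_j) / temperature for f_j in f]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denominator - logits[i]
    return total / len(p)
```

Under these assumptions, the loss is near zero when prompted features stay aligned with their frozen counterparts and grows as they drift toward other images' features, which is the kind of deviation the re-formation term is described as preventing.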

Code Implementations: 1 repo