CV | CL | Nov 26, 2025

AnchorOPT: Towards Optimizing Dynamic Anchors for Adaptive Prompt Learning

arXiv:2511.21188v1 | 1 citation | h-index: 8 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in prompt learning for vision-language models, offering a plug-and-play module for improved adaptability, though it is incremental as it builds on existing anchor-based methods.

The paper tackles the lack of cross-task and stage-adaptive flexibility in the static anchors used for prompt learning with CLIP models. It proposes AnchorOPT, a dynamic anchor-based framework that learns anchor values from task-specific data and optimizes the positional relationship between anchor and soft tokens. Across diverse datasets, its performance is comparable to or exceeds that of methods with additional modules.

Existing prompt learning methods, which are built upon CLIP models, leverage textual tokens as anchors to guide the learnable soft tokens. This guidance improves CLIP's generalization. However, these anchors, static in both value and position, lack cross-task and stage-adaptive flexibility. To address this limitation, we propose AnchorOPT, a dynamic anchor-based prompt learning framework. Specifically, AnchorOPT introduces dynamism in two key dimensions: (i) anchor values eschew handcrafted explicit textual tokens (e.g., "shape", "color"), instead learning dynamically from task-specific data; and (ii) the positional relationship between anchor and soft tokens is no longer fixed but adaptively optimized via a learnable position matrix conditioned on the training stage and task context. Training occurs in two stages: we first learn the anchor tokens, then freeze and transfer them to the second stage for optimization of soft tokens and the position matrix. Extensive experiments demonstrate that using only a simple learnable anchor and position matrix achieves performance comparable to or exceeding some methods incorporating additional learnable modules or regularization techniques. As a plug-and-play module, AnchorOPT integrates seamlessly into existing frameworks, yielding consistent performance gains across diverse datasets. Code is publicly available at https://github.com/zhengli97/ATPrompt.
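To make the two learnable pieces concrete, the following is a minimal sketch of how anchor tokens, soft tokens, and a position matrix might combine into a prompt. It is not the authors' implementation. The abstract does not say how the position matrix is applied, so the softmax-weighted mixing here is an assumption, and the names `compose_prompt`, `random_tokens`, and `position_logits` are hypothetical. No training loop is shown; the comments only mark which quantities each stage would update.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def compose_prompt(anchor_tokens, soft_tokens, position_logits):
    """Mix anchor and soft tokens into prompt slots.

    ASSUMPTION: each output slot is a softmax-weighted combination of all
    tokens. The abstract does not specify how the position matrix is applied.
    """
    tokens = anchor_tokens + soft_tokens  # nested lists, shape (n_tokens, dim)
    if any(len(row) != len(tokens) for row in position_logits):
        raise ValueError("each row of position_logits needs one logit per token")
    dim = len(tokens[0])
    prompt = []
    for row in position_logits:
        weights = softmax(row)
        slot = [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        prompt.append(slot)
    return prompt


def random_tokens(count, dim, rng):
    """Hypothetical initializer: small Gaussian embeddings."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
dim = 4
anchors = random_tokens(2, dim, rng)  # stage 1 would learn these, then freeze them
soft = random_tokens(3, dim, rng)     # stage 2 would optimize these...
n_tokens = len(anchors) + len(soft)
# ...together with the position logits. A dominant diagonal approximates a
# fixed layout in which every token stays in its own slot.
position_logits = [
    [20.0 if i == j else 0.0 for j in range(n_tokens)] for i in range(n_tokens)
]

prompt = compose_prompt(anchors, soft, position_logits)
```

With the near-identity logits above, the composed prompt reproduces the static "anchors followed by soft tokens" layout. Changing the logits shifts how much each slot draws from each token, which is the kind of positional flexibility the abstract describes.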

