CL | AI | Nov 13, 2022

FPT: Improving Prompt Tuning Efficiency via Progressive Training

arXiv:2211.06840v1290 citations | h-index: 98
Originality: Incremental advance
AI Analysis

The paper addresses training inefficiency for users of parameter-efficient fine-tuning methods, though the contribution is incremental because it builds on existing prompt tuning techniques.

The paper tackles the slow convergence of prompt tuning in pre-trained language models by proposing Fast Prompt Tuning (FPT), which uses progressive training with partial models to recycle soft prompts, saving over 30% training computations while maintaining comparable performance.

Recently, prompt tuning (PT) has gained increasing attention as a parameter-efficient way of tuning pre-trained language models (PLMs). Despite extensively reducing the number of tunable parameters and achieving satisfying performance, PT is training-inefficient due to its slow convergence. To improve PT's training efficiency, we first make some novel observations about the prompt transferability of "partial PLMs", which are defined by compressing a PLM in depth or width. We observe that the soft prompts learned by different partial PLMs of various sizes are similar in the parameter space, implying that these soft prompts could potentially be transferred among partial PLMs. Inspired by these observations, we propose Fast Prompt Tuning (FPT), which starts by conducting PT using a small-scale partial PLM, and then progressively expands its depth and width until the full-model size. After each expansion, we recycle the previously learned soft prompts as initialization for the enlarged partial PLM and then proceed with PT. We demonstrate the feasibility of FPT on 5 tasks and show that FPT could save over 30% of training computations while achieving comparable performance.
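The progressive schedule described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a tiny frozen "PLM" made of residual feed-forward layers, a soft prompt reduced to a single vector, depth compression that keeps the first few layers, width compression that keeps the first few feed-forward neurons, and a finite-difference optimizer standing in for backpropagation. All names (`make_frozen_plm`, `partial_forward`, `tune_prompt`, `fast_prompt_tuning`) and the schedule values are hypothetical.

```python
import math
import random


def make_frozen_plm(num_layers, dim, ffn, seed=0):
    """Build a toy frozen model: each layer is a residual feed-forward block (U, V)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return [(mat(ffn, dim), mat(dim, ffn)) for _ in range(num_layers)]


def partial_forward(plm, prompt, depth, width):
    """Run a partial model: first `depth` layers, first `width` FFN neurons (assumed compression)."""
    h = list(prompt)
    for U, V in plm[:depth]:
        hidden = [math.tanh(sum(U[k][j] * h[j] for j in range(len(h)))) for k in range(width)]
        h = [h[i] + sum(V[i][k] * hidden[k] for k in range(width)) for i in range(len(h))]
    return h


def loss(plm, prompt, target, depth, width):
    """Squared error between the partial model's output and a toy target."""
    out = partial_forward(plm, prompt, depth, width)
    return sum((o - t) ** 2 for o, t in zip(out, target))


def tune_prompt(plm, prompt, target, depth, width, steps=30, lr=0.1, eps=1e-4):
    """Tune only the prompt; the model weights are never modified."""
    prompt = list(prompt)
    current = loss(plm, prompt, target, depth, width)
    for _ in range(steps):
        # Finite-difference gradient with respect to each prompt entry.
        grad = []
        for i in range(len(prompt)):
            bumped = prompt[:i] + [prompt[i] + eps] + prompt[i + 1:]
            grad.append((loss(plm, bumped, target, depth, width) - current) / eps)
        candidate = [p - lr * g for p, g in zip(prompt, grad)]
        cand_loss = loss(plm, candidate, target, depth, width)
        if cand_loss < current:
            prompt, current = candidate, cand_loss  # accept improving steps only
        else:
            lr *= 0.5  # otherwise shrink the step size
    return prompt, current


def fast_prompt_tuning(plm, target, schedule, dim, seed=1):
    """Tune the prompt on progressively larger partial models, recycling it between stages."""
    rng = random.Random(seed)
    prompt = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    history = []
    for depth, width in schedule:
        start = loss(plm, prompt, target, depth, width)
        # The prompt from the previous stage is the initialization for this stage.
        prompt, end = tune_prompt(plm, prompt, target, depth, width)
        history.append((depth, width, start, end))
    return prompt, history


plm = make_frozen_plm(num_layers=4, dim=4, ffn=8)
target = [0.5, -0.5, 0.25, 0.0]
schedule = [(1, 2), (2, 4), (4, 8)]  # (depth, width); the last stage is the full model
prompt, history = fast_prompt_tuning(plm, target, schedule, dim=4)
for depth, width, start, end in history:
    print(f"depth={depth} width={width} loss {start:.4f} -> {end:.4f}")
```

Early stages run fewer layers and fewer neurons per step, which is where the compute saving would come from in this toy setting; the final stage always trains against the full model so that the prompt is fitted to the model actually used at inference.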

Code Implementations: 1 repo
