CL · May 24, 2023

Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning

arXiv:2305.15212v1 · 230 citations
AI Analysis

This work addresses parameter-efficient fine-tuning for language models, offering an incremental improvement over existing prefix tuning methods.

The authors tackled the problem of expensive fine-tuning of large language models by proposing Adaptive Prefix Tuning (APT), which adjusts prefix vectors at token and layer levels using a gate mechanism, resulting in improved effectiveness on SuperGLUE and NER datasets.

Fine-tuning all the parameters of a large pre-trained language model for each downstream task is prohibitively expensive. Hence, parameter-efficient fine-tuning, which optimizes only a few task-specific parameters while keeping the pre-trained model frozen, has attracted attention. In this work, we focus on prefix tuning, which optimizes only continuous prefix vectors (i.e., pseudo tokens) inserted into Transformer layers. Based on the observation that the learned syntactic and semantic representations vary considerably across layers, we argue that an adaptive prefix can be tailored to each layer better than a fixed one, making fine-tuning more effective and efficient. Thus, we propose Adaptive Prefix Tuning (APT), which adjusts the prefix at both the fine-grained token level and the coarse-grained layer level with a gate mechanism. Experiments on the SuperGLUE and NER datasets show the effectiveness of APT. In addition, using the gate as a probe, we validate the efficiency and effectiveness of the variable prefix.
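
The abstract does not spell out the form of the gate, so the following is only a minimal sketch of one way a two-level gate could rescale a layer's prefix. It assumes a sigmoid gate per pseudo token (token level) multiplied by a single scalar per layer (layer level). The names `gated_prefix`, `token_gate_logits`, and `layer_scale` are hypothetical and do not come from the paper. In a real model the logits and the scale would be learned, or computed from hidden states, rather than passed in by hand.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_prefix(prefix, token_gate_logits, layer_scale):
    """Rescale the prefix vectors of one layer.

    Illustrative assumption only, not the paper's exact formulation.

    prefix: list of prefix vectors, one list of floats per pseudo token.
    token_gate_logits: one logit per pseudo token; its sigmoid is the
        fine-grained, token-level gate.
    layer_scale: one scalar shared by all pseudo tokens in this layer,
        standing in for the coarse-grained, layer-level adjustment.
    """
    if len(prefix) != len(token_gate_logits):
        raise ValueError("need exactly one gate logit per prefix vector")
    gates = [sigmoid(z) for z in token_gate_logits]
    return [[layer_scale * g * x for x in vec] for g, vec in zip(gates, prefix)]


# Toy example: 3 pseudo tokens of width 2 at a single layer.
prefix = [[1.0, -2.0], [0.5, 0.5], [4.0, 0.0]]
adapted = gated_prefix(prefix, token_gate_logits=[0.0, 10.0, -10.0], layer_scale=0.5)

# A logit of 0 gives a gate of 0.5, so the first vector is scaled by 0.5 * 0.5 = 0.25.
print(adapted[0])  # [0.25, -0.5]
```

In this toy setup a large positive logit leaves a pseudo token almost untouched apart from the layer scale, while a large negative logit nearly switches it off. This is the sense in which a gate value could be read as a probe of how much each prefix position contributes at a given layer.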

