CL · Jan 1, 2021

Prefix-Tuning: Optimizing Continuous Prompts for Generation

arXiv:2101.00190v1 · 5876 citations
Originality: Highly original
AI Analysis

Prefix-tuning addresses the problem of adapting large pretrained language models to natural language generation tasks without storing a full copy of the model for each task, which particularly benefits users with limited storage or training data.

This paper proposes prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks. It optimizes a small continuous task-specific vector (prefix) while keeping language model parameters frozen, achieving comparable performance to fine-tuning with only 0.1% of parameters, outperforming it in low-data settings, and showing better extrapolation to unseen topics.

Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks. However, it modifies all the language model parameters and therefore necessitates storing a full copy for each task. In this paper, we propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks, which keeps language model parameters frozen, but optimizes a small continuous task-specific vector (called the prefix). Prefix-tuning draws inspiration from prompting, allowing subsequent tokens to attend to this prefix as if it were "virtual tokens". We apply prefix-tuning to GPT-2 for table-to-text generation and to BART for summarization. We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting, outperforms fine-tuning in low-data settings, and extrapolates better to examples with topics unseen during training.
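
The "virtual tokens" idea in the abstract can be illustrated with a toy attention step. The sketch below is not the authors' implementation: it assumes a single attention head, uses random vectors as stand-ins for the keys and values a frozen model would compute, and omits training entirely. All names and sizes (`attend`, `DIM`, `PREFIX_LEN`, and so on) are hypothetical and chosen only for illustration. It shows the one mechanical point the abstract makes: prefix vectors are prepended to the sequence so that a later token's query attends to them alongside the real tokens.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def rand_vectors(rng, count, dim):
    """Random stand-in vectors (assumption: replaces real model activations)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
DIM, SEQ_LEN, PREFIX_LEN = 8, 4, 2  # toy sizes, not taken from the paper

# Frozen side: stand-ins for keys/values the pretrained model derives from real tokens.
token_keys = rand_vectors(rng, SEQ_LEN, DIM)
token_values = rand_vectors(rng, SEQ_LEN, DIM)

# Trainable side: continuous prefix vectors, not tied to any vocabulary entry.
# In prefix-tuning these would be the only values updated by the optimizer.
prefix_keys = rand_vectors(rng, PREFIX_LEN, DIM)
prefix_values = rand_vectors(rng, PREFIX_LEN, DIM)

query = rand_vectors(rng, 1, DIM)[0]  # query of a token that follows the prefix

plain = attend(query, token_keys, token_values)
with_prefix = attend(query, prefix_keys + token_keys, prefix_values + token_values)

# Largest change in any output coordinate caused by the prefix.
shift = max(abs(a - b) for a, b in zip(plain, with_prefix))
print(f"max output shift from prefix: {shift:.4f}")
```

Because the prefix enters only through extra key/value positions, changing it steers the output without touching any frozen quantity, which is why a single copy of the language model can be shared across tasks with one small prefix stored per task.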

Code Implementations: 13 repos

Data from Papers with Code (CC-BY-SA-4.0)

