CL AI CV IR LG · Aug 30, 2021

Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

Ningyu Zhang, Luoqiu Li, Xiang Chen, Shumin Deng, Zhen Bi, Chuanqi Tan, Fei Huang, Huajun Chen

arXiv:2108.13161v7 · 11.7 · 205 citations · Has Code

Originality: Highly original

AI Analysis

This work addresses inefficient few-shot learning in NLP for real-world applications by providing a pluggable and extensible solution, though it is incremental in that it builds on existing prompt-based methods.

The study tackles the challenge of making pre-trained language models effective few-shot learners without relying on model scaling or manual prompt engineering. It proposes DART, a differentiable prompt optimization method that reformulates tasks as language modeling and optimizes the prompt template and target labels via backpropagation, improving few-shot performance on standard NLP tasks.

Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and prompt design, hindering their implementation in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering. The main principle behind this approach involves reformulating potential natural language processing tasks into the task of a pre-trained language model and differentially optimizing the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) Plugged to any pre-trained language models; (ii) Extended to widespread classification tasks. A comprehensive evaluation of standard NLP tasks demonstrates that the proposed approach achieves a better few-shot performance. Code is available in https://github.com/zjunlp/DART.
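The core idea in the abstract, treating the prompt template and the target labels as continuous parameters and tuning them with backpropagation, can be illustrated with a deliberately tiny sketch. This is not the authors' implementation (see the linked repository for that). It assumes a toy 2x2 matrix `W` standing in for the pre-trained model, held fixed here only for simplicity, plus made-up data `X` and `Y`. The names `forward`, `mean_loss`, and `train` are hypothetical. Gradients are written out by hand so the example needs only the Python standard library.

```python
import math

# Toy stand-in for the pre-trained model (an assumption; kept fixed in this sketch).
W = [[1.0, -0.5], [0.5, 1.0]]

# Made-up few-shot data: 2-d input features and binary labels.
X = [(1.0, 0.0), (0.8, 0.2), (0.0, 1.0), (0.2, 0.9)]
Y = [0, 0, 1, 1]


def forward(x, prompt, label_emb):
    """Return the hidden state and class probabilities for one example."""
    u = [x[i] + prompt[i] for i in range(2)]  # input shifted by the prompt vector
    h = [math.tanh(W[r][0] * u[0] + W[r][1] * u[1]) for r in range(2)]
    logits = [sum(e[i] * h[i] for i in range(2)) for e in label_emb]
    m = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return h, [v / total for v in exps]


def mean_loss(prompt, label_emb):
    """Average cross-entropy over the toy dataset."""
    losses = [-math.log(forward(x, prompt, label_emb)[1][y]) for x, y in zip(X, Y)]
    return sum(losses) / len(X)


def train(steps=300, lr=0.2):
    """Gradient descent on the prompt vector and label embeddings only."""
    prompt = [0.0, 0.0]                   # trainable "prompt template"
    label_emb = [[0.1, 0.0], [0.0, 0.1]]  # trainable "target label" vectors
    initial = mean_loss(prompt, label_emb)
    n = len(X)
    for _ in range(steps):
        g_prompt = [0.0, 0.0]
        g_label = [[0.0, 0.0], [0.0, 0.0]]
        for x, y in zip(X, Y):
            h, probs = forward(x, prompt, label_emb)
            # Gradient of cross-entropy w.r.t. logits: softmax minus one-hot.
            d_logits = [probs[k] - (1.0 if k == y else 0.0) for k in range(2)]
            for k in range(2):
                for i in range(2):
                    g_label[k][i] += d_logits[k] * h[i] / n
            # Backpropagate through the logits, the tanh, and W to the prompt.
            d_h = [sum(d_logits[k] * label_emb[k][i] for k in range(2)) for i in range(2)]
            d_z = [d_h[r] * (1.0 - h[r] ** 2) for r in range(2)]
            for i in range(2):
                g_prompt[i] += sum(W[r][i] * d_z[r] for r in range(2)) / n
        prompt = [prompt[i] - lr * g_prompt[i] for i in range(2)]
        label_emb = [[label_emb[k][i] - lr * g_label[k][i] for i in range(2)]
                     for k in range(2)]
    return initial, mean_loss(prompt, label_emb), prompt, label_emb


if __name__ == "__main__":
    initial, final, prompt, label_emb = train()
    print(f"loss before: {initial:.3f}, after: {final:.3f}")
    print("learned prompt vector:", prompt)
```

In a realistic setting the prompt and label vectors would be token embeddings fed to an actual pre-trained language model, and an automatic differentiation library would replace the hand-derived gradients.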
