CL · AI · May 23, 2022

Prompt Tuning for Discriminative Pre-trained Language Models

Yuan Yao, Bowen Dong, Ao Zhang, Zhengyan Zhang, Ruobing Xie, Zhiyuan Liu, Leyu Lin, Maosong Sun, Jianyong Wang

arXiv:2205.11166v1 · 32.2651 citations · h-index: 98 · Has Code

Originality: Highly original

AI Analysis

This work addresses a gap in prompt tuning for discriminative PLMs, offering a novel method that improves performance and stability over vanilla fine-tuning. It is most relevant to NLP researchers and practitioners working with large models or in low-resource settings.

The paper tackles prompt tuning for discriminative pre-trained language models (PLMs) such as ELECTRA, which, to the authors' knowledge, had not been explored before. It presents DPT, a framework that reformulates NLP tasks into a discriminative language modeling problem and achieves significantly higher performance than vanilla fine-tuning on text classification and question answering.

Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks. However, to the best of our knowledge, existing works focus on prompt-tuning generative PLMs that are pre-trained to generate target tokens, such as BERT. It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned. In this work, we present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem. Comprehensive experiments on text classification and question answering show that, compared with vanilla fine-tuning, DPT achieves significantly higher performance, and also prevents the unstable problem in tuning large PLMs in both full-set and low-resource settings. The source code and experiment details of this paper can be obtained from https://github.com/thunlp/DPT.
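To make "reformulating a task as discriminative language modeling" concrete, here is a minimal sketch of one possible reading, not the authors' implementation. Each candidate label word is filled into a prompt, a discriminator estimates how likely that word is to be a replaced token, and the label whose word looks most original wins. The template, the label words, and `toy_discriminator` (a rule-based stand-in for a real discriminative PLM such as ELECTRA) are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence

# A discriminator maps a token sequence to one "replaced" probability per token.
Discriminator = Callable[[Sequence[str]], List[float]]


def build_prompt(text: str, label_word: str) -> List[str]:
    """Hypothetical template: '<text> It was <label_word> .'"""
    return text.split() + ["It", "was", label_word, "."]


def classify(text: str, label_words: Dict[str, str], disc: Discriminator) -> str:
    """Pick the label whose word the discriminator deems most 'original'."""
    scores = {}
    for label, word in label_words.items():
        tokens = build_prompt(text, word)
        replaced_probs = disc(tokens)
        label_pos = len(tokens) - 2  # the label word sits just before the final "."
        scores[label] = 1.0 - replaced_probs[label_pos]
    return max(scores, key=scores.get)


def toy_discriminator(tokens: Sequence[str]) -> List[float]:
    """Assumption: a rule-based stand-in for a trained discriminator.

    It flags 'good' or 'bad' as replaced when the word disagrees with
    simple sentiment cue words found in the sequence.
    """
    pos_cues, neg_cues = {"great", "loved"}, {"boring", "hated"}
    polarity = sum(t.lower() in pos_cues for t in tokens) - sum(
        t.lower() in neg_cues for t in tokens
    )
    probs = []
    for t in tokens:
        if t == "good":
            probs.append(0.1 if polarity > 0 else 0.9)
        elif t == "bad":
            probs.append(0.1 if polarity < 0 else 0.9)
        else:
            probs.append(0.1)
    return probs


LABEL_WORDS = {"positive": "good", "negative": "bad"}  # hypothetical verbalizer
print(classify("I loved this great movie", LABEL_WORDS, toy_discriminator))
```

In practice the toy scorer would be swapped for the per-token outputs of a pre-trained discriminator; the repository linked in the abstract is the authoritative source for how DPT actually builds its prompts and scores candidates.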
