SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning
This work addresses the need for more generalizable parameter-efficient fine-tuning in multitask learning for NLP, though it appears incremental because it builds on existing prompt tuning methods.
The paper tackles the poor generalization of existing prompt tuning methods in multitask learning by proposing SPT, a semi-parametric prompt tuning method that retrieves memory prompts from a memory bank based on discrete prompts. It reports effectiveness when fine-tuning on 31 tasks from 8 domains, with zero-shot generalization evaluated on 9 held-out datasets.
Pre-trained large language models can efficiently interpolate human-written prompts in a natural way. Multitask prompted learning can improve generalization by training on a diverse set of tasks at once, thus enhancing the potential for more effective downstream fine-tuning. To perform efficient multitask inference in the same batch, parameter-efficient fine-tuning methods such as prompt tuning have been proposed. However, existing prompt tuning methods may lack generalization. We propose SPT, a semi-parametric prompt tuning method for multitask prompted learning. The novel component of SPT is a memory bank from which memory prompts are retrieved based on discrete prompts. Extensive experiments demonstrate the effectiveness of SPT: (i) fine-tuning a full language model with SPT on 31 different tasks from 8 different domains and evaluating zero-shot generalization on 9 held-out datasets under 5 NLP task categories, and (ii) pretraining SPT on the GLUE datasets and evaluating fine-tuning on the SuperGLUE datasets.
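The retrieval idea described above (memory prompts looked up in a memory bank using the discrete prompt as the query) can be illustrated with a minimal sketch. The abstract does not specify how the query is formed, which similarity function is used, or how many memory prompts are retrieved, so the mean pooling, the cosine similarity, the `top_k` parameter, and all function names below are assumptions made for illustration only, not the authors' implementation.

```python
import math

def mean_pool(vectors):
    """Average a list of equal-length vectors into one query vector (assumed pooling)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_memory_prompts(prompt_embeddings, memory_bank, top_k=2):
    """Return indices of the top_k memory prompts most similar to the pooled discrete prompt."""
    query = mean_pool(prompt_embeddings)
    scores = [cosine(query, memory) for memory in memory_bank]
    ranked = sorted(range(len(memory_bank)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]

def build_input(prompt_embeddings, memory_bank, top_k=2):
    """Prepend the retrieved memory prompts to the discrete prompt embeddings (assumed layout)."""
    indices = retrieve_memory_prompts(prompt_embeddings, memory_bank, top_k)
    return [memory_bank[i] for i in indices] + list(prompt_embeddings)

# Toy example: a 4-entry memory bank of 4-dimensional vectors (hypothetical values).
memory_bank = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
# Embeddings of a two-token discrete prompt (hypothetical values).
prompt_embeddings = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.2, 0.8, 0.0],
]
retrieved = retrieve_memory_prompts(prompt_embeddings, memory_bank, top_k=2)
model_input = build_input(prompt_embeddings, memory_bank, top_k=2)
```

In this toy setup the memory bank is a fixed list; in a trained system the memory prompts would presumably be learned parameters, while the lookup itself adds no parameters, which is one reading of the term "semi-parametric".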