Adversarial Robustness of Prompt-based Few-Shot Learning for Natural Language Understanding
This work addresses adversarial robustness for researchers and practitioners who use few-shot learning in NLP. Its contribution is incremental: it evaluates existing methods rather than proposing new ones.
The study assesses the adversarial robustness of prompt-based few-shot learning methods for natural language understanding. It finds that vanilla methods are less robust than fully fine-tuned models, but that using unlabeled data and multiple prompts improves robustness; increasing the number of few-shot examples and the model size also improves it.
State-of-the-art few-shot learning (FSL) methods leverage prompt-based fine-tuning to obtain remarkable results on natural language understanding (NLU) tasks. While many prior FSL methods focus on improving downstream task performance, there is limited understanding of the adversarial robustness of such methods. In this work, we conduct an extensive study of several state-of-the-art FSL methods to assess their robustness to adversarial perturbations. To better understand the impact of various factors on robustness (or the lack of it), we evaluate prompt-based FSL methods against fully fine-tuned models along several aspects: the use of unlabeled data, multiple prompts, the number of few-shot examples, and model size and type. Our results on six GLUE tasks indicate that, compared to fully fine-tuned models, vanilla FSL methods suffer a notable relative drop in task performance (i.e., are less robust) in the face of adversarial perturbations. However, using (i) unlabeled data for prompt-based FSL and (ii) multiple prompts flips the trend. We further demonstrate that increasing the number of few-shot examples and the model size leads to increased adversarial robustness of vanilla FSL methods. Broadly, our work sheds light on the adversarial robustness evaluation of prompt-based FSL methods for NLU tasks.
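To make the notion of a "relative drop in task performance" concrete, the following is a minimal sketch of one common way to compute it. It assumes the drop is measured as the percentage of clean-set accuracy lost on the adversarial set; the abstract does not state the exact formula, and the function name `relative_drop` and all numbers below are hypothetical, chosen only for illustration.

```python
def relative_drop(clean_acc: float, adv_acc: float) -> float:
    """Percentage of clean accuracy lost under adversarial perturbation.

    Assumed definition (not confirmed by the abstract):
        100 * (clean_acc - adv_acc) / clean_acc
    """
    if clean_acc <= 0:
        raise ValueError("clean_acc must be positive")
    return 100.0 * (clean_acc - adv_acc) / clean_acc


# Hypothetical accuracies, for illustration only (not results from the paper).
full_ft_drop = relative_drop(clean_acc=90.0, adv_acc=60.0)
few_shot_drop = relative_drop(clean_acc=80.0, adv_acc=40.0)

# Under this definition, a larger relative drop means a less robust model.
less_robust = "few-shot" if few_shot_drop > full_ft_drop else "fully fine-tuned"
print(f"fully fine-tuned: {full_ft_drop:.1f}%  few-shot: {few_shot_drop:.1f}%")
print(f"less robust in this toy example: {less_robust}")
```

Using a relative rather than an absolute drop lets models with different clean accuracies be compared on the same scale, which matters when few-shot models start from a lower clean baseline than fully fine-tuned ones.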