Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation
This work addresses parameter-efficient adaptation for practitioners who must use black-box LLMs but lack the computational resources for finetuning. The contribution is incremental, however, since it builds on existing black-box and data augmentation techniques.
The paper tackles few-shot text classification without access to the gradients of large language models. It treats the LLM as a black-box feature extractor and trains a classifier on text augmented through prompt-based finetuning of a much smaller auxiliary model. In experiments on eight datasets, the resulting approach, BT-Classifier, significantly outperforms state-of-the-art black-box few-shot learners and performs on par with full-model tuning methods.
Training or finetuning large-scale language models (LLMs) such as GPT-3 requires substantial computation resources, motivating recent efforts to explore parameter-efficient adaptation to downstream tasks. One practical area of research is to treat these models as black boxes and interact with them through their inference APIs. In this paper, we investigate how to optimize few-shot text classification without accessing the gradients of the LLMs. To achieve this, we treat the black-box model as a feature extractor and train a classifier with the augmented text data. Data augmentation is performed using prompt-based finetuning on an auxiliary language model with a much smaller parameter size than the black-box model. Through extensive experiments on eight text classification datasets, we show that our approach, dubbed BT-Classifier, significantly outperforms state-of-the-art black-box few-shot learners and performs on par with methods that rely on full-model tuning.
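The pipeline described in the abstract (frozen black-box features, augmented few-shot data, a small trainable classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything named here is a hypothetical stand-in: `black_box_features` replaces the LLM inference API with a toy bag-of-words encoder, `augment` replaces the finetuned auxiliary model with two fixed templates, and the classifier is a plain logistic regression chosen for brevity. The only point is that training touches the classifier weights and never the feature extractor.

```python
import math

DIM = 64      # feature size of the toy extractor (assumption)
_SLOTS = {}   # token -> feature index, state hidden inside the "black box"

def black_box_features(text):
    """Hypothetical stand-in for an LLM inference API: text in, vector out, no gradients."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        # Each new token gets the next free slot (toy encoder, not a real model).
        idx = _SLOTS.setdefault(token, len(_SLOTS) % DIM)
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def augment(text, label):
    """Placeholder for the auxiliary model: the original text plus two template variants."""
    return [(text, label), ("honestly " + text, label), (text + " overall", label)]

def train_classifier(examples, epochs=200, lr=0.5):
    """Fit binary logistic regression with SGD on the frozen features."""
    feats = [(black_box_features(t), y) for t, y in examples]
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for x, y in feats:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            grad = 1.0 / (1.0 + math.exp(-z)) - y   # d(loss)/dz for log loss
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b

def predict(model, text):
    """Return 1 if the classifier score is positive, else 0."""
    w, b = model
    x = black_box_features(text)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Four labeled examples stand in for the few-shot training set (1 = positive).
few_shot = [("great movie", 1), ("terrible movie", 0),
            ("loved every minute", 1), ("hated every minute", 0)]
augmented = [pair for text, label in few_shot for pair in augment(text, label)]
model = train_classifier(augmented)
```

Calling `predict(model, "a great film")` then scores an unseen sentence using only the classifier weights and the extractor output. In a realistic setting the extractor would be a remote API call and the augmentation step would come from a generative model, so both stand-ins would be swapped out while the training loop stays conceptually the same.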