ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios
This work addresses the challenge of reducing annotation effort in few-shot text classification. The approach can also benefit non-few-shot scenarios and help other active learning strategies overcome cold-start issues.
The paper tackles the cold-start problem in active learning for few-shot textual scenarios by introducing ActiveLLM, which uses large language models such as GPT-4 to select instances. This yields significant performance gains for BERT classifiers, outperforms traditional active learning methods, and improves few-shot learning approaches such as ADAPET, PERFECT, and SetFit.
Active learning is designed to minimize annotation effort by prioritizing the instances that most enhance learning. However, many active learning strategies struggle with a "cold-start" problem: they need substantial initial data to be effective. This limitation reduces their utility in the increasingly relevant few-shot scenarios, where instance selection has a substantial impact. To address this, we introduce ActiveLLM, a novel active learning approach that leverages Large Language Models such as GPT-4, o1, Llama 3, or Mistral Large for selecting instances. We demonstrate that ActiveLLM significantly enhances the classification performance of BERT classifiers in few-shot scenarios, outperforming traditional active learning methods as well as improving the few-shot learning methods ADAPET, PERFECT, and SetFit. Additionally, ActiveLLM can be extended to non-few-shot scenarios, allowing for iterative selections. In this way, ActiveLLM can even help other active learning strategies to overcome their cold-start problem. Our results suggest that ActiveLLM offers a promising solution for improving model performance across various learning setups.
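The core idea, letting a large language model choose which unlabeled instances to annotate, can be illustrated with a minimal sketch. The prompt wording, the function names (build_selection_prompt, parse_indices, select_instances), and the generic llm callable are all assumptions made for illustration; they are not the prompts or code used in the paper. Any model client can be plugged in, as long as it maps a prompt string to a reply string.

```python
import re
from typing import Callable, List, Sequence


def build_selection_prompt(pool: Sequence[str], budget: int) -> str:
    """Format the unlabeled pool as a numbered list and ask for `budget` picks.

    The wording is a hypothetical example, not the prompt from the paper.
    """
    numbered = "\n".join(f"{i}: {text}" for i, text in enumerate(pool))
    return (
        "You are helping to select training data for a text classifier.\n"
        f"From the unlabeled instances below, choose the {budget} that would "
        "be most useful to annotate.\n"
        "Answer with the instance numbers only, separated by commas.\n\n"
        f"{numbered}"
    )


def parse_indices(reply: str, pool_size: int, budget: int) -> List[int]:
    """Extract distinct, in-range indices from the reply, in order of appearance.

    Assumes the model follows the "numbers only" instruction; any other
    digits in the reply would be picked up as indices as well.
    """
    picked: List[int] = []
    for token in re.findall(r"\d+", reply):
        idx = int(token)
        if idx < pool_size and idx not in picked:
            picked.append(idx)
    return picked[:budget]


def select_instances(
    pool: Sequence[str], budget: int, llm: Callable[[str], str]
) -> List[int]:
    """Ask the model which instances to annotate; no labeled data is required."""
    prompt = build_selection_prompt(pool, budget)
    return parse_indices(llm(prompt), len(pool), budget)


if __name__ == "__main__":
    pool = [
        "The battery died after two days.",
        "Great phone, love the camera.",
        "Delivery was late but support was helpful.",
        "Terrible. Would not buy again.",
    ]

    # Stand-in for a real model call, used only to make the sketch runnable.
    def fake_llm(prompt: str) -> str:
        return "2, 0"

    print(select_instances(pool, budget=2, llm=fake_llm))
```

Because the selection needs no labeled seed set, a step like this can run before any classifier has been trained, which is where cold-start strategies have the least to work with. The returned indices would then be sent to annotators, and the labeled instances used to train the downstream classifier.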