CL | LG | Apr 3, 2024

Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data

Amazon
arXiv:2404.02422v1 | 84 citations | h-index: 37 | LREC
Originality: Incremental advance
AI Analysis

This paper addresses the computational inefficiency of few-shot LLM classification in low-resource settings. The contribution appears incremental, since it builds on existing PEFT and data generation techniques.

The paper tackles the efficiency-accuracy trade-off in low-resource text classification with LLMs, proposing a method using synthetic data and PEFT to achieve comparable or better accuracy than in-context learning while maintaining 0-shot efficiency, with competitive results on multiple datasets.

Large Language Models (LLMs) operating in 0-shot or few-shot settings achieve competitive results in Text Classification tasks. In-Context Learning (ICL) typically achieves better accuracy than the 0-shot setting, but it pays in terms of efficiency, due to the longer input prompt. In this paper, we propose a strategy to make LLMs as efficient as 0-shot text classifiers, while getting comparable or better accuracy than ICL. Our solution targets the low-resource setting, i.e., when only 4 examples per class are available. Using a single LLM and few-shot real data, we perform a sequence of generation, filtering, and Parameter-Efficient Fine-Tuning steps to create a robust and efficient classifier. Experimental results show that our approach leads to competitive results on multiple text classification datasets.
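The efficiency cost of ICL mentioned in the abstract comes from the demonstrations added to every input prompt. The sketch below illustrates that point only; it is not the paper's method. The prompt templates, the example texts, and the whitespace-based token count are all assumptions made for illustration, and a real tokenizer would give different absolute numbers.

```python
# Illustrative sketch: compare the length of a 0-shot prompt with an ICL prompt
# that carries 4 demonstrations per class. Templates and data are hypothetical.

def zero_shot_prompt(text, labels):
    """Build a prompt containing only the instruction and the input text."""
    header = f"Classify the text into one of: {', '.join(labels)}.\n"
    return f"{header}Text: {text}\nLabel:"


def few_shot_prompt(text, labels, demos):
    """Build an ICL prompt: same instruction, plus labelled demonstrations."""
    header = f"Classify the text into one of: {', '.join(labels)}.\n"
    shots = "".join(f"Text: {d}\nLabel: {l}\n" for d, l in demos)
    return f"{header}{shots}Text: {text}\nLabel:"


def approx_tokens(prompt):
    """Rough proxy for prompt length: whitespace-separated word count."""
    return len(prompt.split())


LABELS = ["positive", "negative"]

# Hypothetical demonstrations: 4 examples per class, as in the low-resource setting.
DEMOS = [
    ("The battery lasts all day and charges quickly.", "positive"),
    ("Great sound quality for the price.", "positive"),
    ("Setup took two minutes and everything worked.", "positive"),
    ("The fabric feels soft and the fit is perfect.", "positive"),
    ("It stopped working after one week.", "negative"),
    ("The screen arrived cracked and support never replied.", "negative"),
    ("Much smaller than shown in the pictures.", "negative"),
    ("The app crashes every time I open it.", "negative"),
]

query = "The zipper broke the first time I used the bag."
zero = zero_shot_prompt(query, LABELS)
few = few_shot_prompt(query, LABELS, DEMOS)

print("0-shot prompt length:", approx_tokens(zero))
print("ICL prompt length:   ", approx_tokens(few))
```

Because the demonstrations are repeated for every classified input, the extra length is paid at each inference call. A fine-tuned classifier that takes only the 0-shot prompt avoids this.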
