CL | Nov 27, 2023

FreeAL: Towards Human-Free Active Learning in the Era of Large Language Models

arXiv:2311.15614v1 | 142 citations | h-index: 17 | Has Code
Originality: Highly original
AI Analysis

This work addresses the high cost of labeled data collection for NLP tasks by eliminating human intervention, offering a novel approach in the LLM era.

The paper tackles the problem of reducing annotation costs in NLP by proposing FreeAL, a collaborative learning framework that uses an LLM as an annotator and an SLM to filter high-quality samples, achieving enhanced zero-shot performance without human supervision across eight benchmark datasets.

Collecting high-quality labeled data for model training is notoriously time-consuming and labor-intensive for various NLP tasks. Many solutions have been proposed, such as active learning for small language models (SLMs) and the now-prevalent in-context learning with large language models (LLMs), and they alleviate the labeling burden to some extent, but their performance still depends on human intervention. How to reduce annotation cost in the LLM era remains underexplored. To bridge this gap, we rethink traditional active learning and propose FreeAL, a collaborative learning framework that interactively distills and filters task-specific knowledge from LLMs. During collaborative training, an LLM serves as an active annotator that imparts its coarse-grained knowledge, while a downstream SLM acts as a student that selects high-quality in-context samples and feeds them back to the LLM for subsequent label refinement. Extensive experiments on eight benchmark datasets demonstrate that FreeAL substantially improves the zero-shot performance of both the SLM and the LLM without any human supervision. The code is available at https://github.com/Justherozen/FreeAL.
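The collaborative loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `free_al_loop`, `toy_llm`, `toy_train_slm`, `rounds`, and `keep_ratio` are hypothetical names, the "LLM" and "SLM" are tiny rule-based stand-ins, and ranking samples by a confidence score is an assumed selection criterion (the abstract only says the SLM filters high-quality samples).

```python
from collections import Counter, defaultdict
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, label)


def free_al_loop(
    texts: List[str],
    llm_annotate: Callable[[str, List[Example]], str],
    train_slm: Callable[[List[Example]], Callable[[str, str], float]],
    rounds: int = 2,
    keep_ratio: float = 0.5,
) -> Tuple[List[str], List[Example]]:
    """Alternate between LLM annotation and SLM-based sample selection."""
    demos: List[Example] = []   # in-context demonstrations fed back to the LLM
    labels: List[str] = []
    for _ in range(rounds):
        # 1) The LLM annotates every unlabeled text, using the current demos.
        labels = [llm_annotate(t, demos) for t in texts]
        # 2) The SLM is trained on the (noisy) LLM labels and returns a scorer.
        score = train_slm(list(zip(texts, labels)))
        # 3) Keep the samples the SLM trusts most as the next round's demos.
        ranked = sorted(zip(texts, labels),
                        key=lambda ex: score(ex[0], ex[1]), reverse=True)
        keep = max(1, int(len(ranked) * keep_ratio))
        demos = ranked[:keep]
    return labels, demos


def toy_llm(text: str, demos: List[Example]) -> str:
    """Hypothetical annotator: word-overlap vote over demos, else a keyword rule."""
    words = set(text.split())
    votes: Counter = Counter()
    for demo_text, demo_label in demos:
        overlap = len(words & set(demo_text.split()))
        if overlap:
            votes[demo_label] += overlap
    if votes:
        return votes.most_common(1)[0][0]
    return "pos" if "good" in words else "neg"


def toy_train_slm(data: List[Example]) -> Callable[[str, str], float]:
    """Hypothetical student: scores a pair by how label-specific its words are."""
    counts = defaultdict(Counter)  # word -> label -> count
    for text, label in data:
        for w in text.split():
            counts[w][label] += 1

    def score(text: str, label: str) -> float:
        ws = text.split()
        return sum(counts[w][label] / sum(counts[w].values()) for w in ws) / len(ws)

    return score


texts = ["good movie", "good plot", "bad movie", "bad acting", "great plot"]
labels, demos = free_al_loop(texts, toy_llm, toy_train_slm)
```

In a real setting, `llm_annotate` would wrap a prompt to an actual LLM and `train_slm` would fine-tune a small model; the loop structure (annotate, train, select, feed back) is the only part this sketch is meant to convey.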

Code Implementations: 1 repo