CL, AI | Oct 17, 2024

Semi-supervised Fine-tuning for Large Language Models

Junyu Luo, Xiao Luo, Xiusi Chen, Zhiping Xiao, Wei Ju, Ming Zhang

Peking U

arXiv:2410.14745v211.215 citations | h-index: 30 | Has Code | NAACL

Originality: Incremental advance

AI Analysis

This addresses a practical bottleneck in adapting LLMs to specific domains when labeled data is scarce, though it appears to be an incremental improvement over existing fine-tuning methods.

The paper tackles the scarcity of labeled data for supervised fine-tuning of large language models. It introduces a semi-supervised framework, SemiEvol, which propagates knowledge from labeled to unlabeled data and selects high-quality pseudo-responses; the authors report significant performance improvements on seven datasets.

Supervised fine-tuning (SFT) is crucial in adapting large language models (LLMs) to a specific domain or task. However, only a limited amount of labeled data is available in practical applications, which poses a severe challenge for SFT in yielding satisfactory results. Therefore, a data-efficient framework that can fully exploit labeled and unlabeled data for LLM fine-tuning is highly anticipated. To this end, we introduce a semi-supervised fine-tuning (SemiFT) task and a framework named SemiEvol for LLM alignment in a propagate-and-select manner. For knowledge propagation, SemiEvol adopts a bi-level approach, propagating knowledge from labeled data to unlabeled data through both in-weight and in-context methods. For knowledge selection, SemiEvol incorporates a collaborative learning mechanism that selects higher-quality pseudo-response samples. We conducted experiments using GPT-4o-mini and Llama-3.1 on seven general or domain-specific datasets, demonstrating significant improvements in model performance on target data. Furthermore, we compared SemiEvol with SFT and self-evolution methods, highlighting its practicality in hybrid data scenarios.
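The propagate-and-select idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how similarity, collaboration, or selection are computed, so every name below (`jaccard`, `build_prompt`, `select_pseudo_labels`, `min_agreement`) is hypothetical. Token overlap stands in for whatever retrieval the paper uses, and simple agreement voting among several models stands in for the collaborative selection step. In-weight propagation (fine-tuning on the labeled set first) is only noted in a comment.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]      # (question, answer)
Model = Callable[[str], str]   # prompt -> response; stands in for an LLM call


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for embedding similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def build_prompt(question: str, labeled: List[Example], k: int = 2) -> str:
    """In-context propagation: prepend the k most similar labeled examples."""
    nearest = sorted(labeled, key=lambda ex: jaccard(question, ex[0]), reverse=True)[:k]
    demos = "\n".join(f"Q: {q}\nA: {a}" for q, a in nearest)
    return f"{demos}\nQ: {question}\nA:"


def select_pseudo_labels(
    unlabeled: List[str],
    labeled: List[Example],
    models: List[Model],
    min_agreement: float = 0.6,
) -> List[Example]:
    """Keep a pseudo-response only when enough models agree on it.

    Assumption: each model would already have been fine-tuned on `labeled`
    (in-weight propagation); that step is outside this sketch.
    """
    selected: List[Example] = []
    for question in unlabeled:
        prompt = build_prompt(question, labeled)
        answers = [model(prompt) for model in models]
        answer, votes = Counter(answers).most_common(1)[0]
        if votes / len(answers) >= min_agreement:
            selected.append((question, answer))
    return selected
```

The selected pairs could then be merged with the labeled set for a further round of fine-tuning. The agreement threshold trades pseudo-label quality against how much unlabeled data is retained.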
