CL | AI | Apr 6, 2023

Instruction Tuning with GPT-4

Microsoft
arXiv:2304.03277v1 | 822 citations | h-index: 59
Originality: Incremental advance
AI Analysis

This work improves instruction tuning for LLMs, potentially benefiting AI developers and researchers, though it appears incremental as it builds on prior methods with a newer model.

The researchers generate instruction-following data for LLM finetuning with GPT-4 instead of earlier models. LLaMA models finetuned on the resulting 52K English and Chinese examples show better zero-shot performance on new tasks than models finetuned on data from previous state-of-the-art generators.

Prior work has shown that finetuning large language models (LLMs) using machine-generated instruction-following data enables such models to achieve remarkable zero-shot capabilities on new tasks, and no human-written instructions are needed. In this paper, we present the first attempt to use GPT-4 to generate instruction-following data for LLM finetuning. Our early experiments on instruction-tuned LLaMA models show that the 52K English and Chinese instruction-following data generated by GPT-4 leads to superior zero-shot performance on new tasks to the instruction-following data generated by previous state-of-the-art models. We also collect feedback and comparison data from GPT-4 to enable a comprehensive evaluation and reward model training. We make our data generated using GPT-4 as well as our codebase publicly available.

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
