CL | AI | Jun 1, 2024

Phased Instruction Fine-Tuning for Large Language Models

arXiv:2406.04371v2 | 26 citations | Has Code
AI Analysis

This paper addresses the challenge of improving instruction adherence in large language models, though the contribution is incremental because it builds on existing fine-tuning approaches.

The paper tackles the problem of improving instruction-following in large language models. It proposes Phased Instruction Fine-Tuning, which trains a model sequentially on subsets of instruction data of increasing difficulty, and reports that this significantly outperforms One-off Instruction Fine-Tuning on models such as Llama-2 and Mistral-7B.

Instruction Fine-Tuning enhances pre-trained language models from basic next-word prediction to complex instruction-following. However, the existing One-off Instruction Fine-Tuning (One-off IFT) method, applied to a diverse instruction dataset, may not effectively boost models' adherence to instructions because it handles instructions of varying complexity simultaneously. To improve this, Phased Instruction Fine-Tuning (Phased IFT) is proposed, based on the idea that learning to follow instructions is a gradual process. It assesses instruction difficulty using GPT-4, divides the instruction data into subsets of increasing difficulty, and uptrains the model sequentially on these subsets. Experiments with Llama-2 7B/13B/70B, Llama-3 8B/70B, and Mistral-7B models using Alpaca data show that Phased IFT significantly outperforms One-off IFT, supporting the progressive alignment hypothesis and providing a simple and efficient way to enhance large language models. Code and datasets from our experiments are freely available at https://github.com/xubuvd/PhasedSFT.
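The phased procedure described in the abstract (score difficulty, split into subsets of increasing difficulty, train on them in order) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each example already carries a numeric `difficulty` score (in the paper these come from GPT-4), it splits the sorted data into nearly equal-sized phases (the paper's actual split rule may differ), and `split_into_phases`, `phased_ift`, and the `train_on_subset` callback are hypothetical names standing in for a real fine-tuning routine.

```python
from typing import Any, Callable, Dict, List, Sequence

Example = Dict[str, Any]


def split_into_phases(
    examples: Sequence[Example],
    num_phases: int = 3,
    score_key: str = "difficulty",  # assumed field holding a precomputed score
) -> List[List[Example]]:
    """Sort examples by difficulty and cut them into phases, easiest first.

    Assumption: phases are nearly equal in size. This is an illustrative
    choice, not necessarily the split used in the paper.
    """
    if num_phases < 1:
        raise ValueError("num_phases must be at least 1")
    ordered = sorted(examples, key=lambda ex: ex[score_key])
    base, extra = divmod(len(ordered), num_phases)
    phases: List[List[Example]] = []
    start = 0
    for i in range(num_phases):
        # The first `extra` phases absorb one leftover example each.
        size = base + (1 if i < extra else 0)
        phases.append(ordered[start:start + size])
        start += size
    return phases


def phased_ift(
    model: Any,
    examples: Sequence[Example],
    train_on_subset: Callable[[Any, List[Example]], Any],  # hypothetical trainer
    num_phases: int = 3,
) -> Any:
    """Continue training the same model on each phase in order of difficulty."""
    for subset in split_into_phases(examples, num_phases):
        # Each phase starts from the weights produced by the previous phase.
        model = train_on_subset(model, subset)
    return model


if __name__ == "__main__":
    toy = [
        {"instruction": f"task {i}", "difficulty": d}
        for i, d in enumerate([3, 1, 5, 2, 4, 1, 2])
    ]
    # Stand-in "trainer" that only records the difficulty order it saw.
    seen = phased_ift([], toy, lambda m, s: m + [ex["difficulty"] for ex in s])
    print(seen)
```

By contrast, One-off IFT in this framing would correspond to a single `train_on_subset` call on the full, unsorted dataset.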

Code Implementations: 1 repo
