CL AI · Jun 1, 2024

Phased Instruction Fine-Tuning for Large Language Models

Wei Pang, Chuan Zhou, Xiao-Hua Zhou, Xiaojie Wang

arXiv:2406.04371v214.426 citations · Has Code

Originality Incremental advance

AI Analysis

This paper addresses the challenge of improving how closely large language models adhere to user instructions. The advance is incremental because it builds on existing fine-tuning approaches.

The paper tackles the problem of improving instruction-following in large language models by proposing Phased Instruction Fine-Tuning, which trains models sequentially on instruction data subsets of increasing difficulty, and reports that it significantly outperforms One-off Instruction Fine-Tuning on models such as Llama-2 and Mistral-7B.

Instruction Fine-Tuning enhances pre-trained language models from basic next-word prediction to complex instruction-following. However, the existing One-off Instruction Fine-Tuning (One-off IFT) method, applied to a diverse instruction dataset, may not effectively boost models' adherence to instructions because it handles instructions of varying complexity simultaneously. To improve this, Phased Instruction Fine-Tuning (Phased IFT) is proposed, based on the idea that learning to follow instructions is a gradual process. It assesses instruction difficulty using GPT-4, divides the instruction data into subsets of increasing difficulty, and uptrains the model sequentially on these subsets. Experiments with Llama-2 7B/13B/70B, Llama-3 8B/70B, and Mistral-7B models using Alpaca data show that Phased IFT significantly outperforms One-off IFT, supporting the progressive alignment hypothesis and providing a simple and efficient way to enhance large language models. Code and datasets from our experiments are freely available at https://github.com/xubuvd/PhasedSFT.
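
The phased schedule described in the abstract can be sketched roughly as follows. This is an illustrative outline, not the authors' implementation: the difficulty scores are assumed to have been produced beforehand (the paper uses GPT-4 for this), the equal-sized split and the default of three phases are assumptions, and `split_by_difficulty`, `phased_ift`, and `train_one_phase` are hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

Example = Dict[str, str]
Model = TypeVar("Model")


def split_by_difficulty(
    examples: Sequence[Example],
    scores: Sequence[float],
    num_phases: int = 3,  # assumption: the phase count is a free parameter here
) -> List[List[Example]]:
    """Sort examples by ascending difficulty score and cut them into
    num_phases contiguous subsets of (nearly) equal size."""
    if len(examples) != len(scores):
        raise ValueError("examples and scores must have the same length")
    if num_phases < 1:
        raise ValueError("num_phases must be at least 1")

    # Indices ordered from easiest to hardest (ties keep their original order).
    order = sorted(range(len(examples)), key=lambda i: scores[i])

    base, extra = divmod(len(order), num_phases)
    phases: List[List[Example]] = []
    start = 0
    for p in range(num_phases):
        size = base + (1 if p < extra else 0)  # earlier phases absorb the remainder
        phases.append([examples[i] for i in order[start:start + size]])
        start += size
    return phases


def phased_ift(
    model: Model,
    phases: Sequence[Sequence[Example]],
    train_one_phase: Callable[[Model, Sequence[Example]], Model],
) -> Model:
    """Fine-tune sequentially: each phase starts from the model produced
    by the previous, easier phase."""
    for subset in phases:
        model = train_one_phase(model, subset)
    return model


if __name__ == "__main__":
    data = [{"instruction": f"task{i}"} for i in range(5)]
    difficulty = [4.0, 1.0, 3.0, 2.0, 5.0]  # hypothetical precomputed scores
    phases = split_by_difficulty(data, difficulty, num_phases=3)

    # Stand-in trainer that only records how many examples each phase saw.
    history = phased_ift([], phases, lambda m, subset: m + [len(subset)])
    print(history)
```

In practice `train_one_phase` would wrap a real supervised fine-tuning run; One-off IFT corresponds to calling it once on the full, unsplit dataset.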
