AI · Jan 16, 2025

Aligning Instruction Tuning with Pre-training

Tsinghua
arXiv:2501.09368v4 · 8 citations · h-index: 25
Originality: Incremental advance
AI Analysis

The paper addresses a bottleneck in improving LLMs across diverse tasks, though the advance is incremental because it builds on existing instruction-tuning methods.

The paper tackles the misalignment between instruction-tuning datasets and pre-training distributions, which limits LLM generalization. It proposes AITP, which rewrites underrepresented pre-training data into instruction-response pairs, and reports consistent performance improvements across eight benchmarks.

Instruction tuning enhances large language models (LLMs) to follow human instructions across diverse tasks, relying on high-quality datasets to guide behavior. However, these datasets, whether manually curated or synthetically generated, are often narrowly focused and misaligned with the broad distributions captured during pre-training, limiting LLM generalization and effective use of pre-trained knowledge. We propose Aligning Instruction Tuning with Pre-training (AITP), a method that bridges this gap by identifying coverage shortfalls in instruction-tuning datasets and rewriting underrepresented pre-training data into high-quality instruction-response pairs. This approach enriches dataset diversity while preserving task-specific objectives. Evaluations on three fully open LLMs across eight benchmarks demonstrate consistent performance improvements with AITP. Ablations highlight the benefits of adaptive data selection, controlled rewriting, and balanced integration, emphasizing the importance of aligning instruction tuning with pre-training distributions to unlock the full potential of LLMs.
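The coverage-gap idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each document has already been projected to a 2-D point (for example, a reduced embedding), and it uses a simple grid histogram as a stand-in for whatever density comparison the paper actually performs. The names `cell_of`, `density`, `underrepresented_indices`, and the `min_gap` threshold are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def cell_of(point: Point, lo: float, hi: float, bins: int) -> Cell:
    """Map a 2-D point to a grid cell over the square [lo, hi] x [lo, hi]."""
    def axis(value: float) -> int:
        index = int((value - lo) / (hi - lo) * bins)
        return min(max(index, 0), bins - 1)  # clamp out-of-range values into the grid
    return axis(point[0]), axis(point[1])


def density(points: Sequence[Point], lo: float, hi: float, bins: int) -> Dict[Cell, float]:
    """Return the share of points that falls in each occupied grid cell."""
    counts = Counter(cell_of(p, lo, hi, bins) for p in points)
    total = len(points)
    return {cell: n / total for cell, n in counts.items()}


def underrepresented_indices(
    pretrain: Sequence[Point],
    instruct: Sequence[Point],
    lo: float = 0.0,
    hi: float = 1.0,
    bins: int = 4,
    min_gap: float = 0.05,  # hypothetical threshold, not taken from the paper
) -> List[int]:
    """Indices of pre-training points in cells where the pre-training share
    exceeds the instruction-tuning share by more than min_gap."""
    pre = density(pretrain, lo, hi, bins)
    ins = density(instruct, lo, hi, bins)
    gap_cells = {c for c, share in pre.items() if share - ins.get(c, 0.0) > min_gap}
    return [i for i, p in enumerate(pretrain) if cell_of(p, lo, hi, bins) in gap_cells]


# Toy data: the instruction set covers only the lower-left region.
pretrain_points = [(0.10, 0.10), (0.15, 0.12), (0.90, 0.90), (0.85, 0.95)]
instruct_points = [(0.10, 0.10), (0.12, 0.14), (0.11, 0.13)]
selected = underrepresented_indices(pretrain_points, instruct_points)
print(selected)  # [2, 3]
```

In this sketch the selected pre-training samples would then be handed to a rewriting model that turns them into instruction-response pairs; that step is omitted here because the abstract does not specify it.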
