Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning

Zhihang Yuan, Chengyu Yue, Long Huang, Litu Ou, Lei Shi

arXiv:2601.13697v1

h-index: 6

Originality: Incremental advance

AI Analysis

The paper addresses inefficient instruction tuning for LLM developers, offering an incremental improvement in data selection methods.

The paper tackles costly and noisy instruction tuning for large language models by proposing GRADFILTERING, a data selection framework that scores examples with a gradient signal-to-noise ratio and keeps the highest-utility subset. The selected subsets match or surpass random subsets and strong baselines in most LLM-as-a-judge evaluations and in human assessment, and they converge faster than competitive filters under the same compute budget.

Instruction tuning is a standard paradigm for adapting large language models (LLMs), but modern instruction datasets are large, noisy, and redundant, making full-data fine-tuning costly and often unnecessary. Existing data selection methods either build expensive gradient datastores or assign static scores from a weak proxy, largely ignoring evolving uncertainty, and thus missing a key source of LLM interpretability. We propose GRADFILTERING, an objective-agnostic, uncertainty-aware data selection framework that utilizes a small GPT-2 proxy with a LoRA ensemble and aggregates per-example gradients into a Gradient Signal-to-Noise Ratio (G-SNR) utility. Our method matches or surpasses random subsets and strong baselines in most LLM-as-a-judge evaluations as well as in human assessment. Moreover, GRADFILTERING-selected subsets converge faster than competitive filters under the same compute budget, reflecting the benefit of uncertainty-aware scoring.
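The abstract does not state the exact G-SNR formula, so the following is only a minimal sketch under an assumed definition: for each example, the squared norm of the mean gradient across ensemble members (signal) divided by the summed per-coordinate variance across members (noise). The function names `gsnr_scores` and `select_top_fraction`, the `(K, N, D)` array layout, and the `eps` stabilizer are illustrative assumptions, not the authors' implementation. Computing the per-example gradients with a GPT-2 proxy and LoRA ensemble is left out.

```python
import numpy as np


def gsnr_scores(grads: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Assumed G-SNR utility, one score per example.

    grads has shape (K, N, D): K ensemble members, N examples,
    D flattened gradient dimensions (hypothetical layout).
    """
    mean = grads.mean(axis=0)          # (N, D) mean gradient across members
    var = grads.var(axis=0)            # (N, D) population variance across members
    signal = (mean ** 2).sum(axis=1)   # (N,) squared norm of the mean gradient
    noise = var.sum(axis=1)            # (N,) total variance across coordinates
    return signal / (noise + eps)      # eps avoids division by zero


def select_top_fraction(scores: np.ndarray, fraction: float) -> np.ndarray:
    """Return indices of the highest-scoring examples, best first."""
    k = max(1, int(round(fraction * len(scores))))
    return np.argsort(-scores, kind="stable")[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for per-example gradients from 4 ensemble members.
    toy_grads = rng.normal(size=(4, 10, 16))
    scores = gsnr_scores(toy_grads)
    print(select_top_fraction(scores, 0.3))
```

Under this assumed definition, an example whose gradients agree across ensemble members scores high, while an example whose gradients point in conflicting directions scores near zero, which is one way to read "uncertainty-aware" scoring.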
