CL | Jan 30

InstructDiff: Domain-Adaptive Data Selection via Differential Entropy for Efficient LLM Fine-Tuning

arXiv:2601.23006v1 | h-index: 6
Originality: Incremental advance
AI Analysis

The paper addresses the cost of fine-tuning for LLM practitioners with a domain-adaptive data selection method that reduces data usage while improving performance. It is an incremental advance because it builds on existing entropy-based techniques.

The paper tackles the problem of costly, domain-specific data selection for fine-tuning large language models by introducing InstructDiff, a framework that uses differential entropy to select data adaptively. It reports a 17% relative improvement over full-data training on mathematical reasoning and 52% on general instruction-following while using only 10% of the data.

Supervised fine-tuning (SFT) is fundamental to adapting large language models, yet training on complete datasets incurs prohibitive costs with diminishing returns. Existing data selection methods suffer from severe domain specificity: techniques optimized for general instruction-following fail on reasoning tasks, and vice versa. We observe that measuring entropy differences between base models and minimally instruction-tuned calibrated models reveals a pattern: samples with the lowest differential entropy consistently yield optimal performance across domains, yet this principle manifests domain-adaptively: reasoning tasks favor entropy increase (cognitive expansion), while general tasks favor entropy decrease (cognitive compression). We introduce InstructDiff, a unified framework that operationalizes differential entropy as a domain-adaptive selection criterion through warmup calibration, bi-directional NLL filtering, and entropy-based ranking. Extensive experiments show that InstructDiff achieves 17% relative improvement over full data training on mathematical reasoning and 52% for general instruction-following, outperforming prior baselines while using only 10% of the data.
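The selection pipeline named in the abstract (differential scores, bi-directional NLL filtering, entropy-based ranking) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes per-sample mean token entropy and NLL have already been computed under both the base model and the warmup-calibrated model. The field names (`h_base`, `h_cal`, `nll_base`, `nll_cal`), the symmetric tail cut on the NLL difference, and ranking by signed entropy difference are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) at one token position."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return -sum((e / z) * math.log(e / z) for e in exps if e > 0)


def select_by_differential_entropy(samples, keep_fraction=0.10, nll_tail=0.10):
    """Pick a subset of samples by differential entropy.

    samples: list of dicts with hypothetical keys h_base, h_cal,
             nll_base, nll_cal (per-sample means under each model).
    keep_fraction: share of the original dataset to keep.
    nll_tail: share dropped from EACH end of the NLL-difference ordering
              (an assumed form of "bi-directional NLL filtering").
    """
    # Differential scores: calibrated model minus base model.
    scored = [
        {
            **s,
            "d_entropy": s["h_cal"] - s["h_base"],
            "d_nll": s["nll_cal"] - s["nll_base"],
        }
        for s in samples
    ]

    # Assumed bi-directional filter: discard both extremes of the NLL shift.
    by_nll = sorted(scored, key=lambda s: s["d_nll"])
    cut = int(len(by_nll) * nll_tail)
    kept = by_nll[cut:len(by_nll) - cut] if cut else by_nll

    # Entropy-based ranking: lowest differential entropy first (assumption:
    # the signed difference is used, not its absolute value).
    ranked = sorted(kept, key=lambda s: s["d_entropy"])
    budget = max(1, round(len(samples) * keep_fraction))
    return ranked[:budget]
```

In practice, `token_entropy` would be averaged over the response tokens of each sample to produce `h_base` and `h_cal`, and the 10% budget mirrors the data fraction reported in the abstract.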

