CL | Jan 15

Skill-Aware Data Selection and Fine-Tuning for Data-Efficient Reasoning Distillation

Lechen Zhang, Yunxiang Zhang, Wei Hu, Lu Wang

arXiv:2601.10109v1 | h-index: 4

Originality: Incremental advance

AI Analysis

The work addresses the need for more data-efficient training in reasoning model distillation, offering an incremental improvement for AI practitioners working with limited data.

The paper tackles data-efficient distillation of reasoning models with a skill-centric framework. Using only 1,000 training examples, it improves over random SFT baselines by +1.6% on Qwen3-4B and +1.4% on Qwen3-8B across five mathematical reasoning benchmarks.

Large reasoning models such as DeepSeek-R1 and their distilled variants achieve strong performance on complex reasoning tasks. Yet, distilling these models often demands large-scale data for supervised fine-tuning (SFT), motivating the pursuit of data-efficient training methods. To address this, we propose a skill-centric distillation framework that efficiently transfers reasoning ability to weaker models with two components: (1) Skill-based data selection, which prioritizes examples targeting the student model's weaker skills, and (2) Skill-aware fine-tuning, which encourages explicit skill decomposition during problem solving. With only 1,000 training examples selected from a 100K teacher-generated corpus, our method surpasses random SFT baselines by +1.6% on Qwen3-4B and +1.4% on Qwen3-8B across five mathematical reasoning benchmarks. Further analysis confirms that these gains concentrate on skills emphasized during training, highlighting the effectiveness of skill-centric training for efficient reasoning distillation.
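
The skill-based data selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how skills are labeled or scored. The sketch assumes each teacher-generated example carries a hypothetical `skills` list and that a per-skill student accuracy table is already available. The function name `select_by_weak_skills`, the field names, and the neutral default of 0.5 for unseen skills are all illustrative assumptions.

```python
def select_by_weak_skills(examples, skill_accuracy, budget):
    """Pick the `budget` examples that target the student's weakest skills.

    examples:       list of dicts with an "id" and a "skills" list (hypothetical schema)
    skill_accuracy: dict mapping skill name -> student accuracy in [0, 1]
    budget:         number of examples to keep (the paper uses 1,000 of 100K)
    """
    def weakness(example):
        skills = example["skills"]
        if not skills:
            return 0.0
        # Assumption: a skill with no measured accuracy is treated as neutral (0.5).
        gaps = [1.0 - skill_accuracy.get(skill, 0.5) for skill in skills]
        return sum(gaps) / len(gaps)

    # Highest mean weakness first; ties are broken by id so the result is deterministic.
    ranked = sorted(examples, key=lambda ex: (-weakness(ex), ex["id"]))
    return ranked[:budget]


# Toy corpus and accuracy table, invented purely for illustration.
corpus = [
    {"id": "a", "skills": ["algebra"]},
    {"id": "b", "skills": ["geometry"]},
    {"id": "c", "skills": ["algebra", "probability"]},
    {"id": "d", "skills": ["probability"]},
]
student_accuracy = {"algebra": 0.9, "geometry": 0.6, "probability": 0.3}

selected = select_by_weak_skills(corpus, student_accuracy, budget=2)
print([example["id"] for example in selected])
```

Ranking by mean skill gap is only one plausible way to "prioritize examples targeting the student model's weaker skills"; a sampling-based scheme would fit the same description.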
