AI · LG · May 24, 2025

The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation

arXiv:2505.18759v1 · 5 citations · h-index: 3 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work targets AI researchers and practitioners who want more efficient reasoning models. Its contribution is incremental: it benchmarks existing data-centric techniques rather than proposing a new method.

The paper addresses the lack of a comprehensive benchmark for data-centric chain-of-thought distillation. It introduces DC-CoT, which systematically evaluates data manipulation techniques across a range of teacher and student models and reports their effect on performance across reasoning datasets.

Data-centric distillation, including data augmentation, selection, and mixing, offers a promising path to creating smaller, more efficient student Large Language Models (LLMs) that retain strong reasoning abilities. However, a comprehensive benchmark that systematically assesses the effect of each distillation approach is still lacking. This paper introduces DC-CoT, the first data-centric benchmark that investigates data manipulation in chain-of-thought (CoT) distillation from method, model, and data perspectives. Using various teacher models (e.g., o4-mini, Gemini-Pro, Claude-3.5) and student architectures (e.g., 3B, 7B parameters), we rigorously evaluate the impact of these data manipulations on student model performance across multiple reasoning datasets, with a focus on in-distribution (IID) generalization, out-of-distribution (OOD) generalization, and cross-domain transfer. Our findings aim to provide actionable insights and establish best practices for optimizing CoT distillation through data-centric techniques, ultimately facilitating the development of more accessible and capable reasoning models. The dataset can be found at https://huggingface.co/datasets/rana-shahroz/DC-COT, and our code is shared at https://anonymous.4open.science/r/DC-COT-FF4C/.
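
The three data manipulations named in the abstract (augmentation, selection, and mixing) can be illustrated with a minimal sketch. This is not the DC-CoT implementation: the `CoTSample` fields, the function names, the rephrasing callback, the word-count threshold, and the toy samples are all assumptions chosen only to show what each operation does to a pool of teacher-generated rationales.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CoTSample:
    """Hypothetical record for one teacher-generated training example."""
    question: str
    rationale: str  # teacher-written chain of thought
    answer: str     # teacher's final answer
    gold: str       # reference answer
    source: str     # which teacher or dataset produced the sample


def augment(samples, rephrase):
    """Augmentation: add one variant per sample with a rephrased question."""
    extra = [
        CoTSample(rephrase(s.question), s.rationale, s.answer, s.gold, s.source)
        for s in samples
    ]
    return samples + extra


def select(samples, max_rationale_words=50):
    """Selection: keep samples whose answer matches the reference
    and whose rationale stays under a word budget (assumed criteria)."""
    return [
        s for s in samples
        if s.answer == s.gold and len(s.rationale.split()) <= max_rationale_words
    ]


def mix(pool_a, pool_b, ratio_a, size, seed=0):
    """Mixing: draw up to `size` samples, about `ratio_a` of them from pool_a."""
    rng = random.Random(seed)
    n_a = min(len(pool_a), round(size * ratio_a))
    n_b = min(len(pool_b), size - n_a)
    mixed = rng.sample(pool_a, n_a) + rng.sample(pool_b, n_b)
    rng.shuffle(mixed)
    return mixed


# Toy data: the second sample contains a wrong teacher answer on purpose.
teacher_a = [
    CoTSample("What is 2 + 3?", "2 plus 3 equals 5.", "5", "5", "teacher_a"),
    CoTSample("What is 4 * 6?", "4 times 6 equals 20.", "20", "24", "teacher_a"),
]
teacher_b = [
    CoTSample("What is 9 - 4?", "9 minus 4 equals 5.", "5", "5", "teacher_b"),
]

augmented = augment(teacher_a, lambda q: "Please answer: " + q)
filtered = select(augmented)
train_set = mix(filtered, teacher_b, ratio_a=0.5, size=2)
```

In this sketch the student would then be fine-tuned on `train_set`. A benchmark of the kind described above varies which of these steps is applied, and with which teacher, while holding the student and evaluation sets fixed.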

