CL · Nov 7, 2025

Effectiveness of Chain-of-Thought in Distilling Reasoning Capability from Large Language Models

arXiv:2511.05184v1 · 4 citations · h-index: 14
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of enhancing reasoning in smaller LLMs on natural language tasks, but its contribution is incremental because it builds on existing CoT and distillation methods.

The paper investigates using Chain-of-Thought prompting in white-box knowledge distillation to transfer reasoning capabilities from larger to smaller large language models, showing improved average performance on natural language reasoning and understanding tasks from the BIG-Bench-Hard benchmark.

Chain-of-Thought (CoT) prompting is a widely used method to improve the reasoning capability of Large Language Models (LLMs). More recently, CoT has been leveraged in Knowledge Distillation (KD) to transfer reasoning capability from a larger LLM to a smaller one. This paper examines the role of CoT in distilling the reasoning capability from larger LLMs to smaller LLMs using white-box KD, analysing its effectiveness in improving the performance of the distilled models for various natural language reasoning and understanding tasks. We conduct white-box KD experiments using LLMs from the Qwen and Llama2 families, employing CoT data from the CoT-Collection dataset. The distilled models are then evaluated on natural language reasoning and understanding tasks from the BIG-Bench-Hard (BBH) benchmark, which presents complex challenges for smaller LLMs. Experimental results demonstrate the role of CoT in improving white-box KD effectiveness, enabling the distilled models to achieve better average performance in natural language reasoning and understanding tasks from BBH.
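
To make the setup concrete, the following is a minimal sketch of what a white-box KD objective with CoT targets can look like. It is not the paper's implementation: the abstract does not state the exact loss, so the forward KL divergence, the temperature parameter, the toy three-token vocabulary, and the "So the answer is" target template are all assumptions made for illustration. The function names (`white_box_kd_loss`, `build_target`) are hypothetical. The key idea shown is that white-box KD compares the teacher's and student's full next-token distributions at every position, and that with CoT data those positions cover the rationale as well as the final answer.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Return log-probabilities for one token position (numerically stable)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_total for x in scaled]


def token_kl(teacher_logits, student_logits, temperature=1.0):
    """Forward KL(teacher || student) at one position (assumed divergence)."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def white_box_kd_loss(teacher_seq, student_seq, temperature=1.0):
    """Average the per-token KL over all target positions.

    White-box KD needs the teacher's logits, not just its sampled text.
    """
    if len(teacher_seq) != len(student_seq):
        raise ValueError("teacher and student must cover the same positions")
    per_token = [
        token_kl(t, s, temperature) for t, s in zip(teacher_seq, student_seq)
    ]
    return sum(per_token) / len(per_token)


def build_target(answer, rationale=None):
    """Build the target text; with CoT the rationale precedes the answer.

    The template below is a hypothetical format, not taken from the paper.
    """
    if rationale is None:
        return answer
    return f"{rationale} So the answer is {answer}."


# Toy logits over a 3-token vocabulary for a 2-position target.
teacher = [[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]]
student = [[1.0, 1.0, 0.0], [0.0, 1.5, 0.5]]

loss = white_box_kd_loss(teacher, student)
cot_target = build_target("yes", rationale="All A are B, and x is an A.")
```

In a real training loop the logits would come from the teacher and student models evaluated on the same tokenised target, and the loss would be minimised with respect to the student's parameters only.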

