Hán Dān Xué Bù (Mimicry) or Qīng Chū Yú Lán (Mastery)? A Cognitive Perspective on Reasoning Distillation in Large Language Models

Yueqing Hu, Xinyang Peng, Shuting Peng, Hanqi Wang, Tianhong Wang

arXiv:2601.05019v1

Originality: Incremental advance

AI Analysis

The paper identifies a critical flaw in current distillation methods for large language models: it argues that human-like cognition emerges from reinforcement learning, not from imitation. The contribution is rated as an incremental advance for AI reasoning research.

The study found that reasoning distillation via Supervised Fine-Tuning (SFT) fails to transmit cognitive structure from teacher models to students. Teacher models track human difficulty scaling with a mean correlation of r=0.64, while distilled students reach only r=0.34 and often fall below their own pre-distillation baselines (negative transfer).

Recent Large Reasoning Models trained via reinforcement learning exhibit a "natural" alignment with human cognitive costs. However, we show that the prevailing paradigm of reasoning distillation -- training student models to mimic these traces via Supervised Fine-Tuning (SFT) -- fails to transmit this cognitive structure. Testing the "Hán Dān Xué Bù" (Superficial Mimicry) hypothesis across 14 models, we find that distillation induces a "Functional Alignment Collapse": while teacher models mirror human difficulty scaling ($\bar{r}=0.64$), distilled students significantly degrade this alignment ($\bar{r}=0.34$), often underperforming their own pre-distillation baselines ("Negative Transfer"). Our analysis suggests that SFT induces a "Cargo Cult" effect, where students ritualistically replicate the linguistic form of reasoning (verbosity) without internalizing the teacher's dynamic resource allocation policy. Consequently, reasoning distillation decouples computational cost from cognitive demand, revealing that human-like cognition is an emergent property of active reinforcement, not passive imitation.
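
The alignment figures quoted above are mean correlations across models. As a minimal sketch of how such a score could be computed, the Python below correlates a per-problem human difficulty measure with the length of each model's reasoning trace, then averages the result over models. The abstract does not specify the exact metric, so the choice of Pearson correlation, the use of token counts as the cost measure, and the names `pearson_r` and `mean_alignment` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def mean_alignment(human_difficulty, token_counts_by_model):
    """Average, over models, of the correlation between human difficulty
    and reasoning-trace length (hypothetical proxy for cognitive cost)."""
    return mean(
        pearson_r(human_difficulty, counts)
        for counts in token_counts_by_model.values()
    )


if __name__ == "__main__":
    # Toy numbers for illustration only; these are not data from the paper.
    human_difficulty = [1.0, 2.0, 3.0, 4.0]
    teachers = {"teacher_a": [120, 250, 380, 500]}   # cost grows with difficulty
    students = {"student_a": [400, 390, 410, 395]}   # verbose but roughly flat
    print("teachers:", mean_alignment(human_difficulty, teachers))
    print("students:", mean_alignment(human_difficulty, students))
```

In this toy setup a model whose trace length grows with problem difficulty scores close to 1, while a model that is uniformly verbose scores near 0, which is the pattern the abstract describes as decoupling computational cost from cognitive demand.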
