LG · AI · CL · Apr 17

Self-Distillation as a Performance Recovery Mechanism for LLMs: Counteracting Compression and Catastrophic Forgetting

arXiv:2604.15794 · 62.4 · h-index: 3
Predicted impact: top 35% in LG · last 90 days
Originality: Incremental advance
AI Analysis

For practitioners deploying LLMs, SDFT offers a practical method to counteract performance loss from fine-tuning or compression, with theoretical insights into why self-distillation works.

This paper introduces Self-Distillation Fine-Tuning (SDFT) to recover the performance of LLMs degraded by compression (quantization, pruning) or by catastrophic forgetting during fine-tuning. It explains the recovery using Centered Kernel Alignment (CKA): in the reported experiments, performance recovery correlates strongly with how closely the student's hidden-layer manifold aligns with the teacher's.

Large Language Models (LLMs) have achieved remarkable success, underpinning diverse AI applications. However, they often suffer from performance degradation due to factors such as catastrophic forgetting during Supervised Fine-Tuning (SFT), quantization, and pruning. In this work, we introduce a performance recovery framework based on Self-Distillation Fine-Tuning (SDFT) that effectively restores model capabilities. Complementing this practical contribution, we provide a rigorous theoretical explanation for the underlying recovery mechanism. We posit that an LLM's generative capability fundamentally relies on the high-dimensional manifold constructed by its hidden layers. To investigate this, we employ Centered Kernel Alignment (CKA) to quantify the alignment between student and teacher activation trajectories, leveraging its invariance to orthogonal transformations and scaling. Our experiments demonstrate a strong correlation between performance recovery and manifold alignment, substantiating the claim that self-distillation effectively aligns the student's high-dimensional manifold with the optimal structure represented by the teacher. This study bridges the gap between practical recovery frameworks and geometric representation theory, offering new insights into the internal mechanisms of self-distillation.
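The invariance property the abstract relies on can be checked numerically. The following is a minimal sketch of linear CKA in NumPy, assuming the linear-kernel variant (the abstract does not say which kernel the paper uses). The function name `linear_cka` and the random "teacher" activations are illustrative only and are not taken from the paper.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices of shape (n_samples, n_features).

    Assumption: linear kernel; the paper may use a different CKA variant.
    """
    # Center each feature so the score does not depend on activation means.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalised by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


rng = np.random.default_rng(0)

# Hypothetical "teacher" activations: 200 tokens, 16 hidden units.
teacher = rng.normal(size=(200, 16))

# A "student" that is the teacher rotated by a random orthogonal matrix
# and rescaled by a constant factor.
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
rotated = 3.0 * teacher @ q

# Activations drawn independently of the teacher.
unrelated = rng.normal(size=(200, 16))

cka_rotated = linear_cka(teacher, rotated)
cka_unrelated = linear_cka(teacher, unrelated)
print(f"rotated + scaled: {cka_rotated:.4f}")
print(f"unrelated:        {cka_unrelated:.4f}")
```

The rotated and rescaled copy scores 1 up to floating-point error, while independently drawn activations score much lower. This is the sense in which CKA compares representation geometry rather than individual neuron values.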

