CV · Jul 21, 2025

Local Dense Logit Relations for Enhanced Knowledge Distillation

arXiv:2507.15911v1 · 9 citations
Originality: Incremental advance
AI Analysis

This work strengthens logit-based knowledge distillation, a practical concern for machine learning practitioners. The advance is incremental, since it builds on existing logit distillation approaches.

The paper tackles the problem of insufficient fine-grained relationship modeling in logit distillation by proposing Local Dense Relational Logit Distillation (LDRLD) with an Adaptive Decay Weight strategy, achieving improved student performance on datasets like CIFAR-100, ImageNet-1K, and Tiny-ImageNet compared to state-of-the-art methods.

State-of-the-art logit distillation methods exhibit versatility, simplicity, and efficiency. Despite the advances, existing studies have yet to delve thoroughly into fine-grained relationships within logit knowledge. In this paper, we propose Local Dense Relational Logit Distillation (LDRLD), a novel method that captures inter-class relationships through recursively decoupling and recombining logit information, thereby providing more detailed and clearer insights for student learning. To further optimize the performance, we introduce an Adaptive Decay Weight (ADW) strategy, which can dynamically adjust the weights for critical category pairs using Inverse Rank Weighting (IRW) and Exponential Rank Decay (ERD). Specifically, IRW assigns weights inversely proportional to the rank differences between pairs, while ERD adaptively controls weight decay based on total ranking scores of category pairs. Furthermore, after the recursive decoupling, we distill the remaining non-target knowledge to ensure knowledge completeness and enhance performance. Ultimately, our method improves the student's performance by transferring fine-grained knowledge and emphasizing the most critical relationships. Extensive experiments on datasets such as CIFAR-100, ImageNet-1K, and Tiny-ImageNet demonstrate that our method compares favorably with state-of-the-art logit-based distillation approaches. The code will be made publicly available.
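The pair-weighting idea in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption rather than the authors' method: the function names, the `top_d` cutoff, the `decay` constant, the product form IRW × ERD, and the use of a two-class KL divergence per category pair are all hypothetical. The sketch also omits the recursive decoupling and the remaining non-target term.

```python
import math
from itertools import combinations


def pair_weight(rank_i, rank_j, decay=0.5):
    """Hypothetical ADW-style weight for a category pair (1-based teacher ranks)."""
    irw = 1.0 / abs(rank_i - rank_j)            # IRW: inverse of the rank difference
    erd = math.exp(-decay * (rank_i + rank_j))  # ERD: assumed exponential decay in the rank sum
    return irw * erd


def binary_softmax(a, b, temperature=1.0):
    """Softmax restricted to two logits; returns the probability of the first."""
    za, zb = a / temperature, b / temperature
    m = max(za, zb)  # subtract the max for numerical stability
    ea, eb = math.exp(za - m), math.exp(zb - m)
    return ea / (ea + eb)


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between two two-class distributions (p, 1-p) and (q, 1-q)."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def local_pair_loss(teacher_logits, student_logits, top_d=3, decay=0.5, temperature=1.0):
    """Weighted sum of pairwise KL terms over the teacher's top_d classes (assumed setup)."""
    # Rank classes by teacher logit, highest first, and keep the top_d.
    order = sorted(range(len(teacher_logits)),
                   key=lambda c: teacher_logits[c], reverse=True)[:top_d]
    loss = 0.0
    for (ri, ci), (rj, cj) in combinations(enumerate(order, start=1), 2):
        p = binary_softmax(teacher_logits[ci], teacher_logits[cj], temperature)
        q = binary_softmax(student_logits[ci], student_logits[cj], temperature)
        loss += pair_weight(ri, rj, decay) * bernoulli_kl(p, q)
    return loss


teacher = [2.0, 1.0, 0.1, -1.0]
student = [0.5, 1.5, 0.0, -0.5]
print(local_pair_loss(teacher, student))
```

Under these assumptions, pairs that are close in rank and near the top of the teacher's ranking receive the largest weights, which matches the abstract's description of emphasizing the most critical relationships.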

