CVAILGJun 17, 2025

Model compression using knowledge distillation with integrated gradients

arXiv:2506.14440v1h-index: 3
Originality Incremental advance
AI Analysis

This provides a practical compression technique for edge device deployment with competitive accuracy, though it appears incremental as it builds on existing knowledge distillation methods.

The paper tackles model compression for deep learning deployment on resource-constrained devices by enhancing knowledge distillation with integrated gradients as a data augmentation strategy, achieving 92.6% accuracy on CIFAR-10 with a 4.1x compression factor and reducing inference time from 140 ms to 13 ms.

Model compression is critical for deploying deep learning models on resource-constrained devices. We introduce a novel method enhancing knowledge distillation with integrated gradients (IG) as a data augmentation strategy. Our approach overlays IG maps onto input images during training, providing student models with deeper insights into teacher models' decision-making processes. Extensive evaluation on CIFAR-10 demonstrates that our IG-augmented knowledge distillation achieves 92.6% testing accuracy with a 4.1x compression factor-a significant 1.1 percentage point improvement ($p<0.001$) over non-distilled models (91.5%). This compression reduces inference time from 140 ms to 13 ms. Our method precomputes IG maps before training, transforming substantial runtime costs into a one-time preprocessing step. Our comprehensive experiments include: (1) comparisons with attention transfer, revealing complementary benefits when combined with our approach; (2) Monte Carlo simulations confirming statistical robustness; (3) systematic evaluation of compression factor versus accuracy trade-offs across a wide range (2.2x-1122x); and (4) validation on an ImageNet subset aligned with CIFAR-10 classes, demonstrating generalisability beyond the initial dataset. These extensive ablation studies confirm that IG-based knowledge distillation consistently outperforms conventional approaches across varied architectures and compression ratios. Our results establish this framework as a viable compression technique for real-world deployment on edge devices while maintaining competitive accuracy.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes