LG May 30, 2025

Smooth Model Compression without Fine-Tuning

arXiv:2505.24469v1
h-index: 49
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient model deployment in real-world applications by enabling state-of-the-art compression without fine-tuning. It is rated incremental because it builds on existing pruning and compression techniques.

The paper tackles the problem of compressing large machine learning models. It introduces smooth regularization during training so that pruning and compression work better afterwards. Without any fine-tuning, a smooth ResNet-18 with 70% fewer parameters reaches up to 91% accuracy on CIFAR-10.
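To make "smooth regularization" concrete, here is a minimal NumPy sketch of the kinds of penalty the paper's abstract names: a nuclear norm and first- and second-order derivative penalties on the weights. The function names, the use of squared finite differences, the choice of axis, and the weighting coefficients are all assumptions made for illustration, not the authors' formulation. In actual training these terms would be written in an autodiff framework and added to the task loss.

```python
import numpy as np


def nuclear_norm(W):
    """Nuclear norm of a 2-D weight matrix: the sum of its singular values."""
    return float(np.linalg.svd(W, compute_uv=False).sum())


def first_order_penalty(W, axis=0):
    """Sum of squared first differences along `axis` (assumed discretization)."""
    d1 = np.diff(W, n=1, axis=axis)
    return float((d1 ** 2).sum())


def second_order_penalty(W, axis=0):
    """Sum of squared second differences along `axis` (assumed discretization)."""
    d2 = np.diff(W, n=2, axis=axis)
    return float((d2 ** 2).sum())


def smooth_regularizer(W, lam_nuc=1e-3, lam_d1=1e-3, lam_d2=1e-3, axis=0):
    """Weighted sum of the three penalties; the lam_* values are hypothetical."""
    return (
        lam_nuc * nuclear_norm(W)
        + lam_d1 * first_order_penalty(W, axis=axis)
        + lam_d2 * second_order_penalty(W, axis=axis)
    )
```

A weight matrix whose entries vary slowly along the chosen axis gets small difference penalties, and a matrix close to low rank gets a small nuclear norm, which is the structure a later low-rank compression step can exploit.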

Compressing and pruning large machine learning models has become a critical step towards their deployment in real-world applications. Standard pruning and compression techniques are typically designed without taking the structure of the network's weights into account, limiting their effectiveness. We explore the impact of smooth regularization on neural network training and model compression. By applying nuclear-norm and first- and second-order derivative penalties to the weights during training, we encourage structured smoothness while preserving predictive performance on par with non-smooth models. We find that standard pruning methods often perform better when applied to these smooth models. Building on this observation, we apply a Singular-Value-Decomposition-based compression method that exploits the underlying smooth structure and approximates the model's weight tensors by smaller low-rank tensors. Our approach enables state-of-the-art compression without any fine-tuning, reaching up to $91\%$ accuracy on a smooth ResNet-18 on CIFAR-10 with $70\%$ fewer parameters.
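The SVD-based compression step described in the abstract can be sketched for the simplest case, a single 2-D weight matrix. This is a generic truncated-SVD factorization, not the paper's exact procedure: the paper works with weight tensors, and how it reshapes them and chooses the rank is not stated here. The names `low_rank_compress` and `param_ratio` are hypothetical.

```python
import numpy as np


def low_rank_compress(W, rank):
    """Approximate W (m x n) by A @ B with A of shape (m, rank), B of shape (rank, n).

    Keeps the `rank` largest singular values; no fine-tuning is involved.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale each kept left singular vector
    B = Vt[:rank, :]
    return A, B


def param_ratio(shape, rank):
    """Parameters stored by the two factors, relative to the dense matrix."""
    m, n = shape
    return rank * (m + n) / (m * n)


def relative_error(W, A, B):
    """Frobenius-norm error of the low-rank approximation, relative to W."""
    return float(np.linalg.norm(W - A @ B) / np.linalg.norm(W))
```

For a 64 x 64 matrix, keeping rank 2 stores 2 * (64 + 64) = 256 values instead of 4096. The smoother and closer to low rank the trained weights are, the smaller the approximation error at a given rank, which is why smoothness-inducing training and this kind of compression fit together.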

