MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning
This work addresses efficient model compression for neural networks and offers a practical option for applications with limited compute and memory, though it appears incremental because it builds on existing pruning techniques.
The paper tackles the problem of sizing neural networks for better runtime and memory performance. It develops MPruner, a pruning algorithm that uses mutual information and CKA-based layer clustering to incorporate global information, and reports up to a 50% reduction in parameters and memory usage with minimal accuracy loss.
Determining the optimal size of a neural network is critical, as it directly impacts runtime performance and memory usage. Pruning is a well-established model compression technique that reduces the size of neural networks while mathematically guaranteeing accuracy preservation. However, many recent pruning methods overlook the global contributions of individual model components, making it difficult to ensure that a pruned model meets the desired dataset and performance requirements. To address these challenges, we developed a new pruning algorithm, MPruner, that leverages mutual information through vector similarity. MPruner utilizes layer clustering with the Centered Kernel Alignment (CKA) similarity metric, allowing us to incorporate global information from the neural network for more precise and efficient layer-wise pruning. We evaluated MPruner across various architectures and configurations, demonstrating its versatility and providing practical guidelines. MPruner achieved up to a 50% reduction in parameters and memory usage for CNN and transformer-based models, with minimal to no loss in accuracy.
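To make the CKA-based clustering idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the linear variant of CKA computed on per-layer activation matrices (rows are examples, columns are features) and a simple greedy rule that groups consecutive layers whose CKA score against the first layer of the current group stays above a threshold. The abstract does not specify the CKA variant or the clustering rule, so the function names `linear_cka` and `cluster_adjacent_layers` and the default threshold of 0.98 are all hypothetical choices made for illustration.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices of shape (n_examples, n_features).

    The feature dimensions of x and y may differ; the number of rows must match.
    """
    # Center each feature column so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def cluster_adjacent_layers(activations, threshold=0.98):
    """Greedily group consecutive layers with similar representations.

    `activations` is a list of per-layer activation matrices recorded on the
    same batch of inputs. A layer joins the current group when its CKA score
    against the group's first layer is at least `threshold` (an assumed rule
    and an assumed default value); otherwise it starts a new group.
    """
    clusters = [[0]]
    for i in range(1, len(activations)):
        anchor = clusters[-1][0]
        if linear_cka(activations[anchor], activations[i]) >= threshold:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters
```

In this sketch, a group containing several layers would mark candidates for layer-wise pruning, since those layers produce nearly the same representation under the assumed metric. Linear CKA is invariant to isotropic scaling and orthogonal transformations of the features, so a layer that only rescales or rotates its input scores close to 1 against the preceding layer. The sketch does not guard against constant activations, which would make the denominator zero.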