Low-Rank Matrix Approximation for Neural Network Compression
This work addresses deployment challenges for deep neural networks by providing a more efficient compression method, though the contribution is incremental, since it builds on existing SVD-based approaches.
The paper tackles the problem of compressing neural networks to reduce memory and computation costs. It introduces an Adaptive-Rank Singular Value Decomposition (ARSVD) method that uses spectral entropy to select the rank of each layer adaptively, and it reports improved performance with lower space and time complexity than static-rank techniques.
Deep Neural Networks (DNNs) face a growing deployment challenge because of their large memory and computation requirements. In this paper, we present a new Adaptive-Rank Singular Value Decomposition (ARSVD) method that uses spectral entropy to approximate the optimal rank for compressing weight matrices in neural networks. Unlike conventional SVD-based methods, which apply the same fixed-rank truncation to every layer, ARSVD selects the rank of each layer adaptively from the entropy of that layer's singular-value distribution. This approach ensures that each layer retains a certain amount of its informational content while redundancy is reduced. Our method enables efficient, layer-wise compression, yielding improved performance with reduced space and time complexity compared to static-rank reduction techniques.
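The idea of entropy-based rank selection can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the exact rule, so the sketch assumes that singular values are normalised to a probability distribution by their sum, and that the rank is the smallest k whose leading components account for a fraction tau of the total spectral entropy. The names spectral_entropy_rank, compress_layer, and tau are hypothetical.

```python
import numpy as np


def spectral_entropy_rank(singular_values, tau=0.9):
    """Pick a rank from the entropy of the singular-value distribution.

    Assumed rule (not taken from the paper): normalise the singular values
    to sum to 1, compute each component's entropy term -p_i * log(p_i), and
    return the smallest k whose cumulative entropy reaches tau * total.
    """
    s = np.asarray(singular_values, dtype=float)
    if s.sum() <= 0:
        return 1  # degenerate all-zero spectrum: keep a single component

    p = s / s.sum()
    h = np.zeros_like(p)
    mask = p > 0  # treat 0 * log(0) as 0
    h[mask] = -p[mask] * np.log(p[mask])

    total = h.sum()
    if total <= 0:
        return 1  # a single non-zero singular value carries everything

    cumulative = np.cumsum(h)
    # First index whose cumulative entropy reaches the target, as a 1-based rank.
    k = int(np.searchsorted(cumulative, tau * total)) + 1
    return min(k, len(s))


def compress_layer(W, tau=0.9):
    """Factor W (m x n) into A (m x k) and B (k x n) with an entropy-chosen k."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    k = spectral_entropy_rank(s, tau)
    A = U[:, :k] * s[:k]  # fold the singular values into the left factor
    B = Vt[:k, :]
    return A, B, k


# Illustrative use on a synthetic weight matrix: low-rank signal plus small noise.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 128))
W += 0.01 * rng.standard_normal(W.shape)

A, B, k = compress_layer(W, tau=0.9)
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print("chosen rank:", k)
print("parameters:", W.size, "->", A.size + B.size)
print("relative reconstruction error:", rel_error)
```

Storing the two factors takes k(m + n) values instead of mn, so the factorisation saves memory only when k < mn / (m + n). In this sketch a single threshold tau is shared across layers, and the per-layer rank then follows from each layer's own spectrum rather than from one fixed global rank.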