Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
This work provides theoretical foundations for compressing neural networks. The contribution is incremental but important for applications that require efficient models.
The paper tackles the problem of compressing large-scale neural networks by using low displacement rank (LDR) matrices as weight matrices, proving that LDR neural networks retain universal approximation properties and efficient error bounds while reducing space and computational complexity.
Recently, low displacement rank (LDR) matrices, also called structured matrices, have been proposed to compress large-scale neural networks. Empirical results have shown that neural networks whose weight matrices are LDR matrices, referred to as LDR neural networks, can achieve significant reductions in space and computational complexity while retaining high accuracy. We formally study LDR matrices in deep learning. First, we prove the universal approximation property of LDR neural networks under a mild condition on the displacement operators. We then show that the error bounds of LDR neural networks are as efficient as those of general neural networks, for both single-layer and multiple-layer structures. Finally, we propose a back-propagation-based training algorithm for general LDR neural networks.
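To make the notion of displacement rank concrete, the following is a minimal sketch in Python with NumPy. It assumes a Stein-type displacement operator M - Z M Z^T with Z the lower shift matrix, which is one common choice and not necessarily the operator used in the paper. The helper names (`toeplitz_matrix`, `lower_shift`, `displacement_rank`) are hypothetical and introduced only for illustration. Under this assumption, a Toeplitz matrix has displacement rank at most 2, whereas a generic dense matrix does not.

```python
import numpy as np

def toeplitz_matrix(first_col, first_row):
    """Build an n x n Toeplitz matrix: entry (i, j) depends only on i - j."""
    n = len(first_col)
    T = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            T[i, j] = first_col[i - j] if i >= j else first_row[j - i]
    return T

def lower_shift(n):
    """Shift matrix Z with ones on the first subdiagonal (illustrative choice)."""
    return np.eye(n, k=-1)

def displacement_rank(M, A, B):
    """Rank of the Stein-type displacement M - A M B (assumed operator form)."""
    return int(np.linalg.matrix_rank(M - A @ M @ B))

n = 8
rng = np.random.default_rng(0)

# A Toeplitz matrix is determined by 2n - 1 numbers instead of n^2.
col = rng.standard_normal(n)
row = rng.standard_normal(n)
row[0] = col[0]  # the two vectors share the diagonal entry
T = toeplitz_matrix(col, row)

Z = lower_shift(n)
rank_toeplitz = displacement_rank(T, Z, Z.T)

# A dense random matrix has no such structure.
D = rng.standard_normal((n, n))
rank_dense = displacement_rank(D, Z, Z.T)

print("Toeplitz displacement rank:", rank_toeplitz)
print("dense displacement rank:", rank_dense)
print("parameters: Toeplitz", 2 * n - 1, "vs dense", n * n)
```

In this sketch, Z T Z^T shifts T one step down and to the right, so the difference T - Z T Z^T can be nonzero only in the first row and first column. That is why its rank is at most 2, and why the matrix can be stored with O(n) rather than O(n^2) parameters.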