CV | Aug 21, 2020

Training Sparse Neural Networks using Compressed Sensing

Jonathan W. Siegel, Jianhong Chen, Pengchuan Zhang, Jinchao Xu

arXiv:2008.09661v2 | 3.37 citations | Has Code

Originality: Highly original

AI Analysis

The work addresses the need for efficient, accurate sparse neural networks in machine learning applications and presents a novel approach rather than an incremental improvement.

The paper tackles the problem of reducing neural network size and complexity by introducing a novel method based on compressed sensing that combines pruning and training into a single step, achieving sparser and more accurate networks on datasets like CIFAR-10, CIFAR-100, and ImageNet compared to state-of-the-art methods.

Pruning the weights of neural networks is an effective and widely-used technique for reducing model size and inference complexity. We develop and test a novel method based on compressed sensing which combines the pruning and training into a single step. Specifically, we utilize an adaptively weighted $\ell^1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks. The adaptive weighting we introduce corresponds to a novel regularizer based on the logarithm of the absolute value of the weights. We perform a series of ablation studies demonstrating the improvement provided by the adaptive weighting and generalized RDA algorithm. Furthermore, numerical experiments on the CIFAR-10, CIFAR-100, and ImageNet datasets demonstrate that our method 1) trains sparser, more accurate networks than existing state-of-the-art methods; 2) can be used to train sparse networks from scratch, i.e. from a random initialization, as opposed to initializing with a well-trained base model; 3) acts as an effective regularizer, improving generalization accuracy.
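To make the adaptively weighted $\ell^1$ idea concrete, here is a minimal sketch in NumPy. It is not the authors' implementation: it uses a plain proximal gradient step instead of the generalized RDA algorithm, and the function names, the smoothing constant `eps`, and the values of `lr` and `lam` are all assumptions chosen for illustration. The per-weight penalty `1 / (|w| + eps)` is the derivative of a logarithmic regularizer `log(|w| + eps)` with respect to `|w|`, which is one standard way a log penalty gives rise to adaptive $\ell^1$ weights.

```python
import numpy as np


def soft_threshold(x, thresh):
    """Elementwise soft-thresholding: the proximal operator of a weighted l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)


def adaptive_l1_weights(w, eps=1e-2):
    """Per-weight penalty strength (hypothetical helper).

    Derivative of log(|w| + eps) with respect to |w|: small weights are
    penalized heavily, large weights only lightly.
    """
    return 1.0 / (np.abs(w) + eps)


def reweighted_prox_step(w, grad, lr=0.1, lam=0.01, eps=1e-2):
    """One proximal gradient step with an adaptively weighted l1 penalty.

    Assumption: a plain proximal step stands in for the paper's
    generalized RDA update, which is not reproduced here.
    """
    penalty = adaptive_l1_weights(w, eps)
    return soft_threshold(w - lr * grad, lr * lam * penalty)


# Toy weights with a zero loss gradient, so only the penalty acts.
w = np.array([1.0, 0.05, -0.5, 0.001])
grad = np.zeros_like(w)
w_new = reweighted_prox_step(w, grad)
```

In this toy run the smallest weight is driven exactly to zero while the largest weights barely move, which is the qualitative behavior that distinguishes an adaptive penalty from a uniform $\ell^1$ penalty.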
