LG IT ML | Jul 3, 2019

Spatially-Coupled Neural Network Architectures

Arman Hasanzadeh, Nagaraj T. Janakiraman, Vamsi K. Amalladinne, Krishna R. Narayanan

arXiv:1907.02051v14.14 citations

Originality: Incremental advance

AI Analysis

This work addresses the issue of parameter efficiency in neural networks for machine learning practitioners, though it is incremental as it builds on existing sparsity techniques.

The paper tackles the problem of reducing trainable parameters in fully connected neural networks by proposing a spatially-coupled architecture that allocates parameters based on feature importance. It reports a 94% reduction in trainable parameters while maintaining performance comparable to conventional networks with dropout.

In this work, we leverage advances in sparse coding techniques to reduce the number of trainable parameters in a fully connected neural network. While most works in the literature impose $\ell_1$ regularization, DropOut, or DropConnect to induce sparsity, our scheme uses feature importance as the criterion for allocating the trainable parameters (resources) efficiently in the network. Although $\ell_1$ regularization ensures sparsity, it requires training on all the resources in a deep neural network. The DropOut/DropConnect techniques reduce the number of trainable parameters in the training stage by dropping a random collection of neurons/edges in the hidden layers. However, neither technique takes the underlying structure of the data into account when dropping neurons/edges. Moreover, these frameworks require storage space equivalent to the number of parameters in a fully connected neural network. We address these issues with a more structured architecture inspired by spatially-coupled sparse constructions. The proposed architecture is shown to perform comparably to a conventional fully connected neural network with dropout while achieving a $94\%$ reduction in the trainable parameters. Extensive simulations are presented, and the performance of the proposed scheme is compared against traditional neural network architectures.
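The abstract does not spell out the construction, so the following is only a rough sketch of the general idea of a fixed, structured sparse layer, not the authors' architecture. It assumes a simple banded connectivity in which each hidden unit is wired to a sliding window of consecutive inputs, and it stores only the retained weights, so both the trainable-parameter count and the storage shrink. The function names, the window rule, and the layer sizes are all hypothetical choices made for illustration.

```python
def window_starts(n_in, n_out, width):
    """Evenly spaced start indices so the windows slide across all inputs."""
    if n_out == 1:
        return [0]
    span = n_in - width
    return [round(j * span / (n_out - 1)) for j in range(n_out)]


def banded_connections(n_in, n_out, width):
    """For each output unit, the list of input indices it is connected to."""
    return [list(range(s, s + width)) for s in window_starts(n_in, n_out, width)]


def forward(x, weights, bias, connections):
    """ReLU layer that stores and uses only the retained weights.

    weights[j][k] multiplies the input at index connections[j][k].
    """
    out = []
    for w_row, b, idx in zip(weights, bias, connections):
        z = b + sum(w * x[i] for w, i in zip(w_row, idx))
        out.append(max(z, 0.0))
    return out


# Illustrative sizes only (assumed, not taken from the paper).
n_in, n_out, width = 100, 50, 10
connections = banded_connections(n_in, n_out, width)

dense_params = n_in * n_out                           # weights in a dense layer
kept_params = sum(len(idx) for idx in connections)    # weights actually stored
reduction = 1.0 - kept_params / dense_params

print(f"dense: {dense_params}, kept: {kept_params}, reduction: {reduction:.0%}")
```

Unlike DropOut or DropConnect, where a fresh random mask is drawn at each training step and the full weight matrix must still be stored, the connectivity here is fixed before training. How the windows should be placed according to feature importance is the part the paper addresses and this sketch does not attempt.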
