Improved Bayesian Compression
This work addresses the problem of deploying neural networks efficiently on mobile devices and at scale, though it appears incremental because it combines existing methods.
The authors tackle neural network compression by combining the Soft-Weight Sharing and Variational Dropout approaches, and report a new state-of-the-art result in model compression.
Compression of neural networks (NNs) has become a highly studied topic in recent years. The main reason is the demand for industrial-scale usage of NNs: deploying them on mobile devices, storing them efficiently, transmitting them via band-limited channels and, most importantly, doing inference at scale. In this work, we propose to join the Soft-Weight Sharing and Variational Dropout approaches, both of which show strong results, to define a new state of the art in model compression.