Bayesian Tensorized Neural Networks with Automatic Rank Selection
This work enables efficient deployment of neural networks on resource-constrained hardware by automating rank selection during tensor compression, though the contribution is incremental because it builds on existing tensor decomposition methods.
The paper tackles the challenge of automatically selecting tensor ranks for compressing neural networks, achieving 7.4× to 137× more compact models directly from training.
Tensor decomposition is an effective approach to compress over-parameterized neural networks and to enable their deployment on resource-constrained hardware platforms. However, directly applying tensor compression in the training process is challenging because it is difficult to choose a proper tensor rank. To address this difficulty, this paper proposes a Bayesian tensorized neural network. Our Bayesian method performs automatic model compression via adaptive tensor rank determination. We also present approaches for posterior density calculation and maximum a posteriori (MAP) estimation for the end-to-end training of our tensorized neural network. We provide experimental validation on a fully connected neural network, a CNN, and a residual neural network, where our work produces $7.4\times$ to $137\times$ more compact neural networks directly from training.
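To make the link between tensor rank and compression ratio concrete, the following is a minimal sketch of the parameter-count arithmetic. It assumes a tensor-train factorization of a single weight matrix, which the abstract does not specify. The layer size, the initial and pruned ranks, the `rank_scales` values, and the helper names (`dense_param_count`, `tt_param_count`, `prune_ranks`) are all hypothetical and chosen only for illustration. The thresholding step is a simple stand-in for rank determination, not the Bayesian procedure of the paper.

```python
from math import prod


def dense_param_count(in_modes, out_modes):
    """Parameters in an uncompressed weight matrix of size prod(in) x prod(out)."""
    return prod(in_modes) * prod(out_modes)


def tt_param_count(in_modes, out_modes, ranks):
    """Parameters in a tensor-train matrix (assumed format).

    Core k has shape (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]),
    with boundary ranks fixed to 1.
    """
    assert len(in_modes) == len(out_modes) == len(ranks) - 1
    assert ranks[0] == 1 and ranks[-1] == 1
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


def prune_ranks(rank_scales, tol=1e-3):
    """Hypothetical stand-in for rank determination.

    rank_scales holds one list of per-slice scale values for each internal
    bond; slices whose scale magnitude is at most tol are dropped, and at
    least one slice is kept per bond.
    """
    kept = [sum(1 for s in scales if abs(s) > tol) for scales in rank_scales]
    return (1, *[max(k, 1) for k in kept], 1)


# Hypothetical 784 x 625 layer, reshaped into four modes on each side.
in_modes = (4, 7, 4, 7)
out_modes = (5, 5, 5, 5)

# Start from a generous rank of 20 on every internal bond.
initial_ranks = (1, 20, 20, 20, 1)

# Made-up scale values: most slices have shrunk to near zero.
rank_scales = [
    [1.0] * 5 + [1e-6] * 15,
    [1.0] * 8 + [1e-6] * 12,
    [1.0] * 3 + [1e-6] * 17,
]
pruned_ranks = prune_ranks(rank_scales)

dense = dense_param_count(in_modes, out_modes)
before = tt_param_count(in_modes, out_modes, initial_ranks)
after = tt_param_count(in_modes, out_modes, pruned_ranks)

print(f"dense parameters:       {dense}")
print(f"TT, initial ranks:      {before} ({dense / before:.1f}x smaller)")
print(f"TT, pruned ranks {pruned_ranks}: {after} ({dense / after:.1f}x smaller)")
```

Because each core's size scales with the product of its two neighboring ranks, reducing the ranks shrinks the model roughly quadratically, which is why selecting ranks automatically during training matters for the final compression ratio.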