Efficient Model Compression for Bayesian Neural Networks
This work addresses the need for efficient deployment of Bayesian neural networks on resource-constrained devices.
The paper tackles the problem of model compression for Bayesian neural networks by developing a novel pruning strategy based on posterior inclusion probabilities from spike-and-slab priors, resulting in better generalizability across simulated and real-world benchmarks.
Model compression has recently drawn much attention within the deep learning community. Compressing a dense neural network offers many advantages, including lower computation cost, deployability to devices with limited storage and memory, and resistance to adversarial attacks. Compression may be achieved via weight pruning or by fully discarding certain input features. Here we demonstrate a novel strategy to emulate principles of Bayesian model selection in a deep learning setup. Given a fully connected Bayesian neural network with spike-and-slab priors trained via a variational algorithm, we obtain the posterior inclusion probability for every node, a quantity that typically gets lost. We employ these probabilities for pruning and feature selection on a host of simulated and real-world benchmark data and find evidence of better generalizability of the pruned model in all our experiments.
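As a rough illustration of how node-level posterior inclusion probabilities could drive pruning, the following minimal sketch removes hidden nodes from a single fully connected layer. It is not the paper's implementation: the function name `prune_by_inclusion`, the array shapes, the example probabilities, and the 0.5 cutoff are all assumptions made for illustration, and the probabilities are hard-coded here rather than learned by a variational algorithm.

```python
import numpy as np


def prune_by_inclusion(w_in, w_out, inclusion_prob, threshold=0.5):
    """Drop hidden nodes whose inclusion probability falls below `threshold`.

    w_in:           (n_inputs, n_hidden) weights feeding the hidden layer
    w_out:          (n_hidden, n_outputs) weights leaving the hidden layer
    inclusion_prob: (n_hidden,) posterior inclusion probability per hidden node
    threshold:      hypothetical cutoff; 0.5 is an assumption, not from the paper
    """
    keep = inclusion_prob >= threshold
    # Removing a node deletes its incoming column and its outgoing row.
    return w_in[:, keep], w_out[keep, :], keep


# Illustrative data only: random weights and made-up inclusion probabilities.
rng = np.random.default_rng(0)
w_in = rng.normal(size=(4, 5))
w_out = rng.normal(size=(5, 2))
inclusion_prob = np.array([0.97, 0.03, 0.81, 0.10, 0.55])

w_in_p, w_out_p, keep = prune_by_inclusion(w_in, w_out, inclusion_prob)
print(int(keep.sum()), w_in_p.shape, w_out_p.shape)  # 3 (4, 3) (3, 2)
```

Under the same assumptions, applying the mask to the input layer (dropping rows of `w_in` for inputs with low inclusion probability) would correspond to discarding input features rather than hidden nodes.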