Masked Bayesian Neural Networks: Theoretical Guarantee and its Posterior Inference
This work addresses the challenge of architecture search in BNNs for researchers and practitioners, offering theoretical guarantees and computational feasibility, though it is incremental in advancing node-sparse methods.
The paper tackles the problem of finding appropriate sparse architectures for Bayesian neural networks (BNNs) by proposing a node-sparse BNN model with near minimax optimal posterior concentration rates and adaptiveness to model smoothness, and develops a novel MCMC algorithm for practical inference.
Bayesian approaches to learning deep neural networks (Bayesian neural networks, BNNs) have received much attention and have been successfully applied in various applications. In particular, BNNs offer better generalization as well as better uncertainty quantification. For a BNN to succeed, searching for an appropriate network architecture is an important task, and various algorithms for finding good sparse neural networks have been proposed. In this paper, we propose a new node-sparse BNN model that has good theoretical properties and is computationally feasible. We prove that the posterior concentration rate to the true model is near minimax optimal and adaptive to the smoothness of the true model. In particular, this adaptiveness is the first of its kind for node-sparse BNNs. In addition, we develop a novel MCMC algorithm that makes Bayesian inference for the node-sparse BNN model feasible in practice.
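To make the notion of node sparsity concrete, the following is a minimal sketch, assuming that sparsity is encoded by one binary mask per hidden layer that switches individual nodes on or off. It is an illustration only: it is not the paper's model, prior, or MCMC algorithm, and the names `masked_mlp_forward` and `active_widths` are hypothetical.

```python
import numpy as np

def masked_mlp_forward(x, weights, biases, masks):
    """Forward pass of a ReLU MLP with binary node masks on the hidden layers.

    Assumed shapes (illustrative): weights[l] is (d_in, d_out), biases[l] is
    (d_out,), and masks[l] is a 0/1 vector of length d_out for hidden layer l.
    """
    h = x
    n_hidden = len(weights) - 1
    for l in range(n_hidden):
        pre = h @ weights[l] + biases[l]
        # A node whose mask entry is 0 outputs exactly zero, so it is
        # effectively removed from the architecture.
        h = np.maximum(pre, 0.0) * masks[l]
    # The output layer is left unmasked.
    return h @ weights[-1] + biases[-1]

def active_widths(masks):
    """Number of active (unmasked) nodes in each hidden layer."""
    return [int(np.sum(m)) for m in masks]

# Toy example with random parameters and random masks.
rng = np.random.default_rng(0)
sizes = [3, 8, 8, 1]
weights = [rng.normal(size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=b) for b in sizes[1:]]
masks = [rng.integers(0, 2, size=b) for b in sizes[1:-1]]

x = rng.normal(size=(5, 3))
y = masked_mlp_forward(x, weights, biases, masks)
print(y.shape, active_widths(masks))
```

In a Bayesian treatment, the masks would be treated as random alongside the weights, so that sampling over them amounts to searching over node-sparse architectures; how the paper does this is not specified in the abstract.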