ML LG CO ME
May 22, 2018

Parsimonious Bayesian deep networks

arXiv:1805.08719v3
11 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of architecture selection in deep learning, offering practitioners an automated and efficient solution. The advance is incremental, since it builds on existing Bayesian nonparametric and greedy learning methods.

The paper tackles the problem of automatically inferring neural network architectures without cross-validation or fine-tuning. It develops parsimonious Bayesian deep networks (PBDNs), which the authors report achieve state-of-the-art classification accuracy and provide interpretable data subtypes near decision boundaries, while keeping the computational complexity of prediction low.

Combining Bayesian nonparametrics and a forward model selection strategy, we construct parsimonious Bayesian deep networks (PBDNs) that infer capacity-regularized network architectures from the data and require neither cross-validation nor fine-tuning when training the model. One of the two essential components of a PBDN is the development of a special infinite-wide single-hidden-layer neural network, whose number of active hidden units can be inferred from the data. The other one is the construction of a greedy layer-wise learning algorithm that uses a forward model selection criterion to determine when to stop adding another hidden layer. We develop both Gibbs sampling and stochastic gradient descent based maximum a posteriori inference for PBDNs, providing state-of-the-art classification accuracy and interpretable data subtypes near the decision boundaries, while maintaining low computational complexity for out-of-sample prediction.
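The greedy layer-wise strategy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names greedy_layerwise, fit_layer, and make_toy_fit_layer are hypothetical, and the selection criterion is assumed to be a single score where lower is better (for example, a penalized fit measure). The sketch shows only the control flow: keep adding a hidden layer while the criterion improves, and stop as soon as it does not.

```python
from typing import Any, Callable, List, Tuple


def greedy_layerwise(
    fit_layer: Callable[[Any], Tuple[Any, Any, float]],
    features: Any,
    max_layers: int = 10,
) -> Tuple[List[Any], List[float]]:
    """Add hidden layers one at a time until the criterion stops improving.

    fit_layer is a hypothetical callable (an assumption of this sketch).
    It trains one layer on the current features and returns
    (layer, new_features, score), where a lower score is better.
    """
    layers: List[Any] = []
    scores: List[float] = []
    best = float("inf")
    for _ in range(max_layers):
        layer, new_features, score = fit_layer(features)
        if score >= best:
            # Forward selection: the candidate layer does not improve
            # the criterion, so discard it and stop growing the network.
            break
        layers.append(layer)
        scores.append(score)
        best = score
        features = new_features  # the next layer is trained on this output
    return layers, scores


def make_toy_fit_layer(scores):
    """Toy stand-in for layer training: replays a fixed list of scores."""
    score_iter = iter(scores)

    def fit_layer(features):
        # A real fit_layer would train on `features`; this one does not.
        return ("layer", features, next(score_iter))

    return fit_layer


if __name__ == "__main__":
    toy = make_toy_fit_layer([10.0, 7.0, 8.0, 1.0])
    kept_layers, kept_scores = greedy_layerwise(toy, features=None)
    print(len(kept_layers), kept_scores)
```

Because the search is greedy, it stops at the first layer that fails to improve the score, even if a deeper network would have scored better later. In the toy run the third candidate is rejected, so the fourth score is never evaluated.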

