MLAug 12, 2015

Bayesian Dropout

Tue Herlau, Morten Mørup, Mikkel N. Schmidt

arXiv:1508.02905v24.07 citations

Originality Incremental advance

AI Analysis

This provides a theoretical justification for dropout, enabling its adoption in Bayesian modeling, though it is incremental as it builds on existing dropout methods.

The paper tackled the lack of a Bayesian foundation for dropout in neural networks by interpreting it as optimal inference under constraints, demonstrating this on a regression model and applying approximate techniques to Bayesian logistic regression, showing performance improvements as models become misspecified.

Dropout has recently emerged as a powerful and simple method for training neural networks preventing co-adaptation by stochastically omitting neurons. Dropout is currently not grounded in explicit modelling assumptions which so far has precluded its adoption in Bayesian modelling. Using Bayesian entropic reasoning we show that dropout can be interpreted as optimal inference under constraints. We demonstrate this on an analytically tractable regression model providing a Bayesian interpretation of its mechanism for regularizing and preventing co-adaptation as well as its connection to other Bayesian techniques. We also discuss two general approximate techniques for applying Bayesian dropout for general models, one based on an analytical approximation and the other on stochastic variational techniques. These techniques are then applied to a Baysian logistic regression problem and are shown to improve performance as the model become more misspecified. Our framework roots dropout as a theoretically justified and practical tool for statistical modelling allowing Bayesians to tap into the benefits of dropout training.

View on arXiv PDF

Similar