Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding
This work addresses the need for uncertainty measures in semantic segmentation for decision-making in visual scene understanding, offering an incremental improvement by applying existing uncertainty methods to new architectures.
The paper tackles the problem of pixel-wise semantic segmentation for scene understanding by introducing Bayesian SegNet, a framework that predicts pixel class labels with model uncertainty using Monte Carlo dropout at test time. The result is a 2-3% improvement in segmentation performance across architectures like SegNet, FCN, and Dilation Network, with no extra parameters, and better performance on smaller datasets.
We present a deep learning framework for probabilistic pixel-wise semantic segmentation, which we term Bayesian SegNet. Semantic segmentation is an important tool for visual scene understanding and a meaningful measure of uncertainty is essential for decision making. Our contribution is a practical system which is able to predict pixel-wise class labels with a measure of model uncertainty. We achieve this by Monte Carlo sampling with dropout at test time to generate a posterior distribution of pixel class labels. In addition, we show that modelling uncertainty improves segmentation performance by 2-3% across a number of state of the art architectures such as SegNet, FCN and Dilation Network, with no additional parametrisation. We also observe a significant improvement in performance for smaller datasets where modelling uncertainty is more effective. We benchmark Bayesian SegNet on the indoor SUN Scene Understanding and outdoor CamVid driving scenes datasets.