LG · Mar 6, 2021

Contextual Dropout: An Efficient Sample-Dependent Dropout Module

Xinjie Fan, Shujian Zhang, Korawat Tanwisuth, Xiaoning Qian, Mingyuan Zhou

arXiv:2103.04181v118.634 citations · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the need for flexible uncertainty modeling in ML applications such as image classification and visual question answering. The advance is incremental, since it builds on existing dropout methods.

The paper tackles the problem of improving uncertainty estimation in deep neural networks by introducing a sample-dependent dropout module that is both scalable and efficient, achieving better accuracy and uncertainty quality on datasets like ImageNet and VQA 2.0.

Dropout has been demonstrated to be a simple and effective module that not only regularizes the training of deep neural networks but also provides uncertainty estimates for predictions. However, the quality of these uncertainty estimates depends heavily on the dropout probabilities. Most current models use the same dropout distribution for all data samples for simplicity. Sample-dependent dropout offers greater flexibility in modeling uncertainty but is less explored, as it often encounters scalability issues or requires non-trivial model changes. In this paper, we propose contextual dropout, a simple and scalable sample-dependent dropout module with an efficient structural design, which can be applied to a wide range of models at the cost of only slightly increased memory and computation. We learn the dropout probabilities with a variational objective that is compatible with both Bernoulli dropout and Gaussian dropout. We apply the contextual dropout module to various models for image classification and visual question answering, and demonstrate the scalability of the method on large-scale datasets such as ImageNet and VQA 2.0. Our experimental results show that the proposed method outperforms baseline methods in both accuracy and quality of uncertainty estimation.
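To make the idea of sample-dependent dropout concrete, the following is a minimal sketch in plain Python. It is not the authors' architecture or training procedure: the gating layer (`keep_probabilities`), its weights, the inverted-dropout rescaling, and the Monte Carlo helper are all illustrative assumptions, and the variational objective from the abstract is not implemented here. The sketch only shows how keep probabilities could be computed from each input rather than fixed globally, and how repeated stochastic passes yield a spread that can serve as an uncertainty signal.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def keep_probabilities(x, weights, bias, eps=1e-6):
    """Hypothetical gating layer: map one sample's features to a keep
    probability per hidden unit. Clamped away from 0 to keep rescaling safe."""
    probs = []
    for row, b in zip(weights, bias):
        logit = sum(w * xi for w, xi in zip(row, x)) + b
        probs.append(min(1.0, max(eps, sigmoid(logit))))
    return probs


def contextual_bernoulli_dropout(h, keep_probs, rng, training=True):
    """Apply a Bernoulli mask whose keep probability differs per unit and
    per sample. Uses inverted-dropout rescaling (an assumption of this sketch)."""
    if not training:
        return list(h)
    out = []
    for hi, p in zip(h, keep_probs):
        keep = rng.random() < p
        out.append(hi / p if keep else 0.0)
    return out


def mc_mean_and_variance(h, keep_probs, rng, n_samples=200):
    """Run several stochastic passes and return the per-unit mean and variance."""
    samples = [contextual_bernoulli_dropout(h, keep_probs, rng)
               for _ in range(n_samples)]
    columns = list(zip(*samples))
    means = [sum(col) / n_samples for col in columns]
    variances = [sum((v - m) ** 2 for v in col) / n_samples
                 for col, m in zip(columns, means)]
    return means, variances


# Two different inputs produce two different sets of keep probabilities.
weights = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.0]]  # illustrative values
bias = [0.0, 0.0, 0.0]
x_a, x_b = [1.0, -2.0], [-1.0, 2.0]
p_a = keep_probabilities(x_a, weights, bias)
p_b = keep_probabilities(x_b, weights, bias)

rng = random.Random(0)
h = [1.0, 2.0, 3.0]  # stand-in hidden activations
means, variances = mc_mean_and_variance(h, p_a, rng)
```

In a standard dropout layer, `p_a` and `p_b` would be identical constants; here they depend on the input, which is the property the abstract refers to as sample-dependent dropout.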
