Dropout-based Active Learning for Regression
This work addresses the problem of computationally efficient active learning for regression in machine learning, offering a method that can be generalized to other deep architectures, though it appears incremental in nature.
The paper tackles the challenge of active learning for high-dimensional regression with expensive annotations by proposing a fast algorithm based on dropout uncertainty estimation for neural networks, achieving comparable or better performance than baselines on synthetic and real-world datasets.
Active learning is relevant and challenging for high-dimensional regression models when the annotation of the samples is expensive. Yet most of the existing sampling methods cannot be applied to large-scale problems, consuming too much time for data processing. In this paper, we propose a fast active learning algorithm for regression, tailored for neural network models. It is based on uncertainty estimation from stochastic dropout output of the network. Experiments on both synthetic and real-world datasets show comparable or better performance (depending on the accuracy metric) as compared to the baselines. This approach can be generalized to other deep learning architectures. It can be used to systematically improve a machine-learning model as it offers a computationally efficient way of sampling additional data.