Diverse mini-batch Active Learning
The work addresses the need for efficient data labeling in deep learning, though its contribution is incremental because it builds on existing active learning methods.
The paper tackles the problem of reducing the labeled data needed to train supervised classification models. It proposes a mini-batch active learning approach that selects multiple examples at once based on informativeness and diversity, achieving performance comparable to or better than earlier approaches with improved scalability.
We study the problem of reducing the amount of labeled training data required to train supervised classification models. We approach it with Active Learning, sequentially selecting the examples that benefit the model most. Selecting examples one by one is impractical given the number of training examples required by modern Deep Learning models. We therefore consider the mini-batch Active Learning setting, in which several examples are selected at once. We present an approach that takes into account both the informativeness of the examples for the model and the diversity of the examples within a mini-batch. By using the well-studied K-means clustering algorithm, this approach scales better than previously proposed approaches and achieves comparable or better performance.
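The abstract does not spell out how informativeness and K-means are combined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes informativeness is a margin-based uncertainty score, that the pool is first prefiltered to the most informative candidates, that K-means is weighted by those scores, and that the example nearest each cluster center is selected. The function names and the `prefilter_factor` parameter are hypothetical and introduced here for illustration.

```python
import numpy as np


def margin_informativeness(probs):
    """Assumed score: 1 - (top-1 prob - top-2 prob); higher means more uncertain."""
    ordered = np.sort(probs, axis=1)
    return 1.0 - (ordered[:, -1] - ordered[:, -2])


def weighted_kmeans(X, weights, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means in which each point pulls its center in proportion to its weight."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            mask = labels == j
            # Empty or zero-weight clusters keep their previous center.
            if mask.any() and weights[mask].sum() > 0:
                new_centers[j] = np.average(X[mask], axis=0, weights=weights[mask])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


def select_diverse_batch(X_pool, probs, batch_size, prefilter_factor=10, seed=0):
    """Return indices into X_pool of a batch that is informative and spread out."""
    scores = margin_informativeness(probs)
    # Step 1: keep only the most informative candidates (assumed prefilter).
    n_keep = min(len(X_pool), prefilter_factor * batch_size)
    candidates = np.argsort(-scores)[:n_keep]
    Xc, wc = X_pool[candidates], scores[candidates]
    # Step 2: cluster the candidates into batch_size groups, weighted by score.
    centers = weighted_kmeans(Xc, wc, batch_size, seed=seed)
    # Step 3: take the not-yet-chosen candidate closest to each center.
    d2 = ((Xc[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    chosen = []
    for j in range(batch_size):
        for idx in np.argsort(d2[:, j]):
            if int(idx) not in chosen:
                chosen.append(int(idx))
                break
    return candidates[np.array(chosen)]


# Usage on synthetic data: 200 unlabeled points, 3 classes, batch of 8.
rng = np.random.default_rng(0)
X_pool = rng.normal(size=(200, 5))
logits = rng.normal(size=(200, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
batch = select_diverse_batch(X_pool, probs, batch_size=8)
```

In this sketch the prefilter keeps the clustering cost tied to the batch size rather than the full pool, while the clustering step discourages near-duplicate selections that pure uncertainty ranking would tend to produce.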