LG | ML | Jan 17, 2019

Diverse mini-batch Active Learning

arXiv:1901.05954v1 | 173 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for efficient data labeling in deep learning, but the advance is incremental because it builds on existing active learning methods.

The paper tackles the problem of reducing labeled data needed for training supervised classification models by proposing a mini-batch active learning approach that selects multiple examples at once based on informativeness and diversity, achieving comparable or better performance with improved scalability.

We study the problem of reducing the amount of labeled training data required to train supervised classification models. We approach it by leveraging Active Learning, through sequential selection of the examples which benefit the model most. Selecting examples one by one is not practical for the number of training examples required by modern Deep Learning models. We consider the mini-batch Active Learning setting, where several examples are selected at once. We present an approach which takes into account both the informativeness of the examples for the model and the diversity of the examples in a mini-batch. By using the well-studied K-means clustering algorithm, this approach scales better than previously proposed approaches and achieves comparable or better performance.
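
To make the idea of combining informativeness with diversity concrete, here is a minimal sketch of one way such a selection step could look. The abstract does not give the details, so everything specific here is an assumption for illustration: the margin-based uncertainty score, the prefilter factor `beta`, the weighting of K-means by uncertainty, and the helper names `margin_uncertainty`, `weighted_kmeans`, and `select_batch`. It should not be read as the authors' exact algorithm.

```python
import random


def margin_uncertainty(probs):
    """Assumed informativeness score: 1 - (top1 - top2) class probability."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(points, weights, k, iters=20, seed=0):
    """Lloyd-style K-means where each point pulls its centroid by its weight."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: weighted mean of the members of each cluster.
        for c in range(k):
            members = [i for i, a in enumerate(assign) if a == c]
            total = sum(weights[i] for i in members)
            if total > 0:  # keep the old center for empty or zero-weight clusters
                centers[c] = [
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(dim)
                ]
    return centers


def select_batch(features, probs, k, beta=3, seed=0):
    """Pick k unlabeled examples that are both uncertain and spread out.

    features: feature vectors of the unlabeled pool
    probs:    predicted class probabilities for the same examples
    beta:     hypothetical prefilter factor (keep the beta*k most uncertain)
    """
    if k > len(features):
        raise ValueError("k cannot exceed the pool size")
    scores = [margin_uncertainty(p) for p in probs]
    # Informativeness: keep only the most uncertain candidates.
    order = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    cand = order[:min(len(order), beta * k)]
    # Diversity: cluster the candidates and take one example per cluster.
    centers = weighted_kmeans([features[i] for i in cand],
                              [scores[i] for i in cand], k, seed=seed)
    chosen = []
    for center in centers:
        best = min((i for i in cand if i not in chosen),
                   key=lambda i: sq_dist(features[i], center))
        chosen.append(best)
    return chosen
```

Calling `select_batch(features, probs, k=32)` on a pool of feature vectors and model predictions would return 32 pool indices to send for labeling. The clustering runs only on the prefiltered candidates, so its cost depends on `beta * k` rather than on the full pool size, which is one plausible reading of the scalability claim above.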

