Algorithm Selection for Deep Active Learning with Imbalanced Datasets
This work addresses label efficiency in deep learning for practitioners working with imbalanced datasets. Its contribution is incremental in that it builds on existing active learning methods.
The paper tackles the problem of selecting the best active learning algorithm for a given dataset, proposing TAILOR, an adaptive algorithm selection strategy that achieves accuracy comparable to or better than the best candidate algorithms in multi-class and multi-label applications.
Label efficiency has become an increasingly important objective in deep learning applications. Active learning aims to reduce the number of labeled examples needed to train deep networks, but the empirical performance of active learning algorithms can vary dramatically across datasets and applications. It is difficult to know in advance which active learning strategy will perform well, let alone best, in a given application. To address this, we propose the first adaptive algorithm selection strategy for deep active learning. For any unlabeled dataset, our (meta) algorithm TAILOR (Thompson ActIve Learning algORithm selection) iteratively and adaptively chooses among a set of candidate active learning algorithms. TAILOR uses novel reward functions aimed at gathering class-balanced examples. Extensive experiments in multi-class and multi-label applications demonstrate TAILOR's effectiveness in achieving accuracy comparable to or better than that of the best of the candidate algorithms. Our implementation of TAILOR is open-sourced at https://github.com/jifanz/TAILOR.
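The idea of adaptively choosing among candidate active learning algorithms can be illustrated with a small bandit loop. The sketch below is not the TAILOR implementation: it assumes a plain Beta-Bernoulli Thompson sampler and a simplified binary reward (1 when the newly labeled example belongs to the rare class), whereas the abstract says only that TAILOR's rewards aim at class balance. The names `thompson_select`, `run_selection`, `random_sampler`, and `high_score_sampler` are hypothetical, and the pool is a toy list of scores rather than a deep network's outputs.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one sample from each candidate's Beta posterior; return the arg max."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def random_sampler(pool, unlabeled, rng):
    """Hypothetical candidate: pick an unlabeled index uniformly at random."""
    return rng.choice(sorted(unlabeled))


def high_score_sampler(pool, unlabeled, rng):
    """Hypothetical candidate: pick the unlabeled example with the highest score."""
    return max(unlabeled, key=lambda i: pool[i])


def run_selection(candidates, pool, labels, rounds, rng):
    """Each round, pick a candidate by Thompson sampling and label one example.

    Assumed reward (a simplification): 1 if the labeled example is in the
    rare class (label == 1), else 0.
    """
    successes = [0] * len(candidates)
    failures = [0] * len(candidates)
    unlabeled = set(range(len(pool)))
    labeled = []
    for _ in range(rounds):
        if not unlabeled:
            break
        k = thompson_select(successes, failures, rng)
        idx = candidates[k](pool, unlabeled, rng)
        unlabeled.remove(idx)
        labeled.append(idx)
        # Update the chosen candidate's posterior with the observed reward.
        if labels[idx] == 1:
            successes[k] += 1
        else:
            failures[k] += 1
    return labeled, successes, failures


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [i / 100 for i in range(100)]           # toy scores
    labels = [1 if s > 0.9 else 0 for s in pool]   # rare class: 9 of 100
    labeled, successes, failures = run_selection(
        [random_sampler, high_score_sampler], pool, labels, rounds=30, rng=rng
    )
    print("rare-class examples labeled:", sum(successes))
    print("per-candidate successes:", successes, "failures:", failures)
```

In this toy setup the posterior of whichever candidate surfaces more rare-class examples tends to shift toward higher reward, so that candidate is chosen more often in later rounds. A real system would retrain the model between rounds and select batches rather than single examples.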