Learning active learning at the crossroads? Evaluation and discussion
This work addresses the challenge of finding a universally effective active learning strategy for practitioners, a long-standing problem in reducing annotation costs.
This paper evaluates meta-learning algorithms for active learning, which aim to learn optimal sample selection strategies rather than relying on hand-designed ones. The study benchmarks a learned strategy against margin sampling combined with a Random Forest across 20 datasets.
Active learning aims to reduce annotation cost by predicting which samples are useful for a human expert to label. Although this field is quite old, several important challenges to using active learning in real-world settings remain unsolved. In particular, most selection strategies are hand-designed, and it has become clear that no single active learning strategy consistently outperforms all others across applications. This has motivated research into meta-learning algorithms for "learning how to actively learn". In this paper, we compare this kind of approach with the combination of a Random Forest and the margin sampling strategy, reported in recent comparative studies as a very competitive heuristic. To this end, we present the results of a benchmark performed on 20 datasets that compares a strategy learned using a recent meta-learning algorithm with margin sampling. We also present some lessons learned and outline future perspectives.
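For readers unfamiliar with the margin sampling heuristic mentioned above, the following is a minimal sketch of its usual definition: query the samples whose two most probable classes are closest in predicted probability. It is an illustration under stated assumptions, not the benchmark code of the paper. The function names `margin_scores` and `select_queries` and the example probabilities are hypothetical; in practice the probabilities would come from a classifier such as a Random Forest (for instance via scikit-learn's `RandomForestClassifier.predict_proba`).

```python
from typing import List, Sequence


def margin_scores(probas: Sequence[Sequence[float]]) -> List[float]:
    """Return, for each sample, the gap between its two highest class probabilities.

    Assumes every row holds at least two class probabilities.
    A small gap means the classifier hesitates between two classes.
    """
    scores = []
    for row in probas:
        best, second = sorted(row, reverse=True)[:2]
        scores.append(best - second)
    return scores


def select_queries(probas: Sequence[Sequence[float]], batch_size: int = 1) -> List[int]:
    """Return the indices of the `batch_size` samples with the smallest margins."""
    scores = margin_scores(probas)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:batch_size]


# Hypothetical class probabilities for three unlabeled samples (illustrative values only).
example_probas = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40]]

# Indices of the two samples the model is least sure about, most ambiguous first.
queries = select_queries(example_probas, batch_size=2)
```

In an active learning loop, the selected samples would be sent to the human expert for labeling, added to the training set, and the classifier retrained before the next round of selection.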