LG · ML · Jul 12, 2018

Practical Obstacles to Deploying Active Learning

arXiv:1807.04801v3 · 1053 citations
Originality: Synthesis-oriented
AI Analysis

The paper highlights practical obstacles facing researchers and practitioners who deploy AL. Its contribution is incremental: it reveals limitations of existing approaches rather than offering broad improvements.

The paper tackles the problem that active learning (AL) does not reliably generalize across models and tasks. It finds that the benefits of AL are modest and inconsistent, and that successor models trained on actively acquired data do not consistently outperform those trained on i.i.d. sampled data.

Active learning (AL) is a widely-used training strategy for maximizing predictive performance subject to a fixed annotation budget. In AL one iteratively selects training examples for annotation, often those for which the current model is most uncertain (by some measure). The hope is that active sampling leads to better performance than would be achieved under independent and identically distributed (i.i.d.) random samples. While AL has shown promise in retrospective evaluations, these studies often ignore practical obstacles to its use. In this paper we show that while AL may provide benefits when used with specific models and for particular domains, the benefits of current approaches do not generalize reliably across models and tasks. This is problematic because in practice one does not have the opportunity to explore and compare alternative AL strategies. Moreover, AL couples the training dataset with the model used to guide its acquisition. We find that subsequently training a successor model with an actively-acquired dataset does not consistently outperform training on i.i.d. sampled data. Our findings raise the question of whether the downsides inherent to AL are worth the modest and inconsistent performance gains it tends to afford.
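The selection step the abstract describes (annotate the examples the current model is most uncertain about, versus drawing an i.i.d. random sample) can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes predictive entropy as the uncertainty measure, which is only one of several possible choices, and the function names and toy probabilities are hypothetical.

```python
import numpy as np


def entropy(probs):
    """Shannon entropy of each row of an (n_examples, n_classes) probability matrix."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


def select_uncertain(probs, k):
    """Active selection: indices of the k pool examples with the highest entropy."""
    scores = entropy(probs)
    return np.argsort(-scores, kind="stable")[:k]


def select_random(n_pool, k, seed=0):
    """i.i.d. baseline: k pool indices drawn uniformly without replacement."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_pool, size=k, replace=False)


# Hypothetical predicted class probabilities for a pool of four unlabeled examples.
pool_probs = np.array([
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
])

picked_active = select_uncertain(pool_probs, k=2)    # examples nearest a 50/50 prediction
picked_random = select_random(len(pool_probs), k=2)  # baseline that ignores the model
```

Note that the active selection depends on `pool_probs`, which comes from the acquisition model. This is the coupling between dataset and model that the abstract refers to. The random baseline has no such dependence.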

