CL · LG · ML · Aug 16, 2018

Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study

arXiv:1808.05697v31162 citations
AI Analysis

This work addresses the challenge of applying active learning to real-world NLP problems by providing empirical guidance for practitioners, though it is incremental as it builds on existing methods.

The paper reports a large-scale empirical study of deep Bayesian active learning for NLP. It finds that Bayesian active learning by disagreement, with uncertainty estimates from Dropout or Bayes-by-Backprop, consistently outperforms i.i.d. baselines and often beats classic uncertainty sampling across tasks, datasets, and models.

Several recent papers investigate Active Learning (AL) for mitigating the data dependence of deep learning for natural language processing. However, the applicability of AL to real-world problems remains an open question. While in supervised learning, practitioners can try many different methods, evaluating each against a validation set before selecting a model, AL affords no such luxury. Over the course of one AL run, an agent annotates its dataset, exhausting its labeling budget. Thus, given a new task, an active learner has no opportunity to compare models and acquisition functions. This paper provides a large-scale empirical study of deep active learning, addressing multiple tasks and, for each, multiple datasets, multiple models, and a full suite of acquisition functions. We find that across all settings, Bayesian active learning by disagreement, using uncertainty estimates provided either by Dropout or Bayes-by-Backprop, significantly improves over i.i.d. baselines and usually outperforms classic uncertainty sampling.
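To make the comparison in the abstract concrete, here is a minimal sketch of two acquisition functions: Bayesian active learning by disagreement (BALD) and classic least-confidence uncertainty sampling. It assumes the common mutual-information form of BALD computed from T stochastic forward passes (for example, with Dropout left on at prediction time); the paper's exact scoring may differ. The function names, the array layout `(T, N, C)`, and the toy probabilities are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) along the class axis."""
    return -np.sum(p * np.log(p + eps), axis=axis)


def bald_scores(probs):
    """BALD score per pool example (assumed mutual-information form).

    probs has shape (T, N, C): T stochastic forward passes,
    N unlabeled pool examples, C classes.
    """
    mean_probs = probs.mean(axis=0)                 # (N, C)
    predictive_entropy = entropy(mean_probs)        # entropy of the averaged prediction
    expected_entropy = entropy(probs).mean(axis=0)  # average entropy of each pass
    return predictive_entropy - expected_entropy


def least_confidence_scores(probs):
    """Classic uncertainty sampling: 1 minus the top averaged class probability."""
    return 1.0 - probs.mean(axis=0).max(axis=-1)


def select_batch(scores, k):
    """Indices of the k highest-scoring pool examples."""
    return np.argsort(-scores)[:k]


# Hypothetical toy pool: 2 stochastic passes, 3 examples, 2 classes.
probs = np.array([
    [[0.9, 0.1], [0.5, 0.5], [0.9, 0.1]],  # pass 1
    [[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]],  # pass 2
])

bald = bald_scores(probs)
least_conf = least_confidence_scores(probs)
chosen = select_batch(bald, k=1)
```

In this toy pool, example 0 is the only one where the passes disagree with each other, so it is the only one with a clearly positive BALD score. Least-confidence scoring cannot separate it from example 1, where every pass is equally unsure: both have an averaged prediction of 0.5 per class.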

