LG · ML · Jun 19, 2019

BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning

arXiv:1906.08158v2 · 756 citations
AI Analysis

BatchBALD addresses the inefficiency and redundancy of batch data acquisition in deep Bayesian active learning, and is aimed at machine learning researchers and practitioners.

The paper tackles the problem of selecting multiple informative data points jointly for deep Bayesian active learning by developing BatchBALD, an efficient approximation algorithm, and achieves new state-of-the-art performance on standard benchmarks with substantial data efficiency improvements.

We develop BatchBALD, a tractable approximation to the mutual information between a batch of points and model parameters, which we use as an acquisition function to select multiple informative points jointly for the task of deep Bayesian active learning. BatchBALD is a greedy linear-time $1 - \frac{1}{e}$-approximate algorithm amenable to dynamic programming and efficient caching. We compare BatchBALD to the commonly used approach for batch data acquisition and find that the current approach acquires similar and redundant points, sometimes performing worse than randomly acquiring data. We finish by showing that, using BatchBALD to consider dependencies within an acquisition batch, we achieve new state-of-the-art performance on standard benchmarks, providing substantial data efficiency improvements in batch acquisition.
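The greedy selection described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes class probabilities from sampled model parameters (for example MC dropout) are already available as an array `probs` of shape `(N, K, C)` for `N` pool points, `K` parameter samples, and `C` classes. The function names `entropy`, `bald_scores`, and `greedy_batchbald` are hypothetical. The joint entropy is computed by exact enumeration of all `C^b` label configurations, so the sketch is only practical for small batches. At each step it scores every candidate by $H(y_{1:n}) - \mathbb{E}_{\omega}[H(y_{1:n} \mid \omega)]$ and caches the per-sample joint probabilities of the points already chosen.

```python
import numpy as np


def entropy(p, axis=-1):
    """Shannon entropy in nats; terms with p == 0 contribute 0."""
    safe = np.where(p > 0, p, 1.0)  # log(1) = 0 avoids log(0) warnings
    return -np.sum(p * np.log(safe), axis=axis)


def bald_scores(probs):
    """Per-point mutual information I(y; w) for probs of shape (N, K, C)."""
    marginal = probs.mean(axis=1)  # (N, C): average over parameter samples
    return entropy(marginal) - entropy(probs).mean(axis=1)


def greedy_batchbald(probs, batch_size):
    """Greedily pick `batch_size` pool indices by joint mutual information.

    Assumption: exact enumeration of label configurations, so memory
    grows as C ** batch_size. Intended for small batches only.
    """
    n_points, n_samples, _ = probs.shape
    # E_w[H(y_i | w)] per point; this term is additive over a batch.
    cond_entropy = entropy(probs).mean(axis=1)  # (N,)
    # joint[m, k] = p(configuration m of the selected points | sample k)
    joint = np.ones((1, n_samples))
    selected, cond_sum = [], 0.0
    for _ in range(batch_size):
        # Joint distribution over (selected configs, candidate label),
        # averaged over parameter samples: shape (N, M, C).
        joint_probs = np.einsum("mk,nkc->nmc", joint, probs) / n_samples
        joint_entropy = entropy(joint_probs.reshape(n_points, -1))
        scores = joint_entropy - (cond_sum + cond_entropy)
        if selected:
            scores[selected] = -np.inf  # never pick the same point twice
        best = int(np.argmax(scores))
        selected.append(best)
        cond_sum += cond_entropy[best]
        # Cache: extend per-sample joints with the new point, (M * C, K).
        joint = (joint[:, None, :] * probs[best].T[None, :, :]).reshape(
            -1, n_samples
        )
    return selected
```

On a toy pool in which two points are exact duplicates, ranking by `bald_scores` alone places both duplicates at the top, while `greedy_batchbald` skips the second copy because it adds no joint information. This is the redundancy the abstract attributes to the commonly used approach.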

Code Implementations: 3 repos
