BatchGFN: Generative Flow Networks for Batch Active Learning
This work addresses the computational bottleneck of batch-aware active learning, a practical concern for machine learning practitioners. The contribution is incremental, however, as it builds on existing generative flow network and active learning frameworks.
The paper tackles batch active learning by proposing BatchGFN, a method that uses generative flow networks to sample batches of data with probability proportional to a reward function such as the joint mutual information. In toy regression tasks, this yields near-optimal batch selection at low computational cost.
We introduce BatchGFN, a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points with probability proportional to a batch reward. With an appropriate reward function to quantify the utility of acquiring a batch, such as the joint mutual information between the batch and the model parameters, BatchGFN is able to construct highly informative batches for active learning in a principled way. In toy regression problems, we show that our approach enables sampling near-optimal utility batches at inference time with a single forward pass per point in the batch. This alleviates the computational complexity of batch-aware algorithms and removes the need for greedy approximations to find maximizers for the batch reward. We also present early results for amortizing training across acquisition steps, which will enable scaling to real-world tasks.
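As a rough illustration of what sampling batches in proportion to a batch reward means, the sketch below enumerates every batch of a tiny pool and draws one with probability proportional to its joint mutual information. It is not BatchGFN itself: exhaustive enumeration stands in for the learned sampler, and the Bayesian linear regression model, the function names, the random pool, and the `exponent` parameter are all assumptions made for this example. For that assumed model, the joint mutual information between the batch labels and the parameters has the closed form 0.5 * log det(I + X_B Σ X_B^T / σ²), where Σ is the prior covariance and σ² the noise variance.

```python
import itertools

import numpy as np


def joint_mutual_information(x_batch, prior_cov, noise_var):
    """I(y_B; theta) for Bayesian linear regression (assumed model).

    Assumes theta ~ N(0, prior_cov) and y = x @ theta + Gaussian noise
    with variance noise_var.
    """
    k = x_batch.shape[0]
    gram = x_batch @ prior_cov @ x_batch.T / noise_var
    _, logdet = np.linalg.slogdet(np.eye(k) + gram)
    return 0.5 * logdet


def batch_distribution(pool, batch_size, prior_cov, noise_var, exponent=1.0):
    """Enumerate all batches and return p(batch) proportional to reward**exponent.

    `exponent` is a hypothetical knob for sharpening the distribution;
    exponent=1.0 samples exactly in proportion to the reward.
    """
    batches = list(itertools.combinations(range(len(pool)), batch_size))
    rewards = np.array(
        [joint_mutual_information(pool[list(b)], prior_cov, noise_var) for b in batches]
    )
    weights = rewards ** exponent
    probs = weights / weights.sum()
    return batches, rewards, probs


# Toy pool of 8 two-dimensional inputs (made-up data for illustration).
rng = np.random.default_rng(0)
pool = rng.normal(size=(8, 2))
prior_cov = np.eye(2)
noise_var = 0.1

batches, rewards, probs = batch_distribution(pool, 3, prior_cov, noise_var)

# Draw one batch with probability proportional to its reward.
sampled_batch = batches[rng.choice(len(batches), p=probs)]

# A batch of two identical points is less informative than two distinct ones.
redundant = joint_mutual_information(np.array([[1.0, 0.0], [1.0, 0.0]]), prior_cov, noise_var)
diverse = joint_mutual_information(np.array([[1.0, 0.0], [0.0, 1.0]]), prior_cov, noise_var)
```

The number of candidate batches grows combinatorially with the pool and batch size, so this enumeration is only feasible for toy pools. That cost is what a learned sampler that builds a batch one point at a time is meant to avoid.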