LG | AI | Jan 31, 2023

Does Deep Active Learning Work in the Wild?

arXiv:2302.00098v2 | 1 citation | h-index: 27
AI Analysis

This work highlights a critical limitation in deploying deep active learning in real-world settings, presenting it as an open problem for researchers in machine learning.

The study evaluated eleven deep active learning methods on eight benchmarks while varying a single shared hyperparameter, the pool ratio. Eight of the eleven methods sometimes underperform random sampling; only three consistently outperform it, and all three select samples using diversity-based criteria.

Deep active learning (DAL) methods have shown significant improvements in sample efficiency compared to simple random sampling. While the studies reporting these gains are valuable, they nearly always assume that optimal DAL hyperparameter (HP) settings are known in advance, or they optimize the HPs by repeating DAL several times with different HP settings. Here, we argue that in real-world settings, or in the wild, there is significant uncertainty regarding good HPs, and that optimizing them contradicts the premise of using DAL (i.e., labeling efficiency), because each repeated run consumes additional labels. In this study, we evaluate the performance of eleven modern DAL methods on eight benchmark problems as we vary a key HP shared by all methods: the pool ratio. Despite adjusting only one HP, our results indicate that eight of the eleven DAL methods sometimes underperform simple random sampling, and some frequently perform worse. Only three methods always outperform random sampling (albeit narrowly), and we find that these methods all use diversity to select samples, a relatively simple criterion. Our findings reveal the limitations of existing DAL methods when deployed in the wild, and present this as an important new open problem in the field.
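To make the role of the pool ratio concrete, here is a minimal sketch of one pool-based selection step. The abstract does not define the pool ratio, so the sketch assumes it is the size of the unlabeled candidate pool divided by the number of points labeled per step. The function names (`sample_pool`, `select_random`, `select_diverse`) are hypothetical, and the greedy farthest-point rule is only a generic stand-in for a diversity criterion, not one of the eleven methods evaluated in the paper.

```python
import random

def sample_pool(candidates, batch_size, pool_ratio, rng):
    """Draw the unlabeled pool. ASSUMPTION: pool size = pool_ratio * batch_size."""
    pool_size = min(len(candidates), pool_ratio * batch_size)
    return rng.sample(candidates, pool_size)

def select_random(pool, batch_size, rng):
    """Baseline: label a uniformly random subset of the pool."""
    return rng.sample(pool, batch_size)

def select_diverse(pool, labeled, batch_size):
    """Greedy farthest-point selection on 1-D inputs (illustrative only).

    Repeatedly pick the pool point whose nearest labeled or
    already-chosen point is farthest away.
    """
    chosen = []
    remaining = list(pool)
    for _ in range(batch_size):
        refs = labeled + chosen
        best = max(
            remaining,
            key=lambda x: min((abs(x - r) for r in refs), default=0.0),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Vary the pool ratio for a fixed batch size, as the study varies it across methods.
rng = random.Random(0)
candidates = [rng.uniform(0.0, 1.0) for _ in range(1000)]
labeled = [0.5]
for pool_ratio in (1, 4, 16):
    pool = sample_pool(candidates, batch_size=5, pool_ratio=pool_ratio, rng=rng)
    batch = select_diverse(pool, labeled, batch_size=5)
    print(pool_ratio, sorted(round(x, 2) for x in batch))
```

Under this assumed definition, a pool ratio of 1 leaves the selection rule no choice: every pool point is labeled, so the step reduces to random sampling. Larger ratios give the criterion more candidates to choose from, which is why the setting can change a method's behavior.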
