LG ML · Nov 30, 2023

Towards Comparable Active Learning

Thorben Werner, Johannes Burchert, Lars Schmidt-Thieme

arXiv:2311.18356v2 · 2.0 · h-index: 5

Originality: Incremental advance

AI Analysis

This paper addresses inconclusive and non-reproducible Active Learning research, a problem for both practitioners and researchers, though its contribution, a standardized evaluation framework, is incremental.

The paper tackles the problem of poor generalization and unfair comparisons in Active Learning research by introducing a framework for fair algorithm evaluation across domains and a fast oracle algorithm. It presents the first AL benchmark covering Tabular, Image, and Text domains, reporting empirical results for 6 algorithms on 9 datasets and providing domain-specific rankings.

Active Learning has received significant attention in the field of machine learning for its potential in selecting the most informative samples for labeling, thereby reducing data annotation costs. However, we show that the reported lifts in recent literature generalize poorly to other domains, leading to an inconclusive landscape in Active Learning research. Furthermore, we highlight overlooked problems in reproducing AL experiments that can lead to unfair comparisons and increased variance in the results. This paper addresses these issues by providing an Active Learning framework for a fair comparison of algorithms across different tasks and domains, as well as a fast and performant oracle algorithm for evaluation. To the best of our knowledge, we propose the first AL benchmark that tests algorithms in 3 major domains: Tabular, Image, and Text. We report empirical results for 6 widely used algorithms on 7 real-world and 2 synthetic datasets and aggregate them into a domain-specific ranking of AL algorithms.
