Parting with Illusions about Deep Active Learning
This work addresses inflated performance claims in deep active learning. By highlighting flaws in existing evaluation protocols, it shows researchers and practitioners that reported progress in the field is largely incremental.
The paper tackles the problem of unrealistic evaluation in deep active learning, showing that many current methods barely outperform random baselines when tested under more realistic settings, including data augmentation and integration with semi-supervised learning.
Active learning aims to reduce the high labeling cost of training machine learning models on large datasets by labeling only the most informative samples. Recently, deep active learning has shown success on various tasks. However, the conventional evaluation scheme used for deep active learning is inadequate. Current methods disregard relevant parallel work in closely related fields. Active learning methods are quite sensitive to changes in the training procedure, such as data augmentation. They improve by a large margin when integrated with semi-supervised learning, yet barely perform better than the random baseline. We re-implement several recent active learning approaches for image classification and evaluate them under more realistic settings. We further validate our findings for semantic segmentation. Based on our observations, we realistically assess the current state of the field and propose a more suitable evaluation protocol.
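To make the comparison against the random baseline concrete, the following is a minimal sketch of a pool-based active learning loop in which an uncertainty-based acquisition strategy and random sampling receive the same labeling budget. It is an illustration only, not the paper's protocol or any of the methods it evaluates: the synthetic two-class data, the nearest-centroid classifier, the entropy acquisition score, and all function names (`make_blobs`, `fit_centroids`, `predict_proba`, `entropy`, `run`) are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def make_blobs(n_per_class, rng):
    # Hypothetical toy data: two overlapping 2-D Gaussian classes.
    X = np.vstack([rng.normal(-1.0, 1.0, size=(n_per_class, 2)),
                   rng.normal(+1.0, 1.0, size=(n_per_class, 2))])
    y = np.repeat([0, 1], n_per_class)
    return X, y

def fit_centroids(X, y):
    # Stand-in "model": one mean vector per class (assumes both classes present).
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict_proba(centroids, X):
    # Softmax over negative squared distances to each class centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def entropy(p):
    # Predictive entropy; higher means the model is less certain.
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def run(strategy, X_pool, y_pool, X_test, y_test, rounds=5, batch=10, seed=0):
    rng = np.random.default_rng(seed)
    # Start from two labeled samples per class so every centroid is defined.
    labeled = [int(i) for c in (0, 1)
               for i in rng.choice(np.flatnonzero(y_pool == c), 2, replace=False)]
    accs = []
    for r in range(rounds + 1):
        centroids = fit_centroids(X_pool[labeled], y_pool[labeled])
        preds = predict_proba(centroids, X_test).argmax(axis=1)
        accs.append(float((preds == y_test).mean()))
        if r == rounds:
            break
        unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled)
        if strategy == "random":
            # Baseline: spend the budget on uniformly sampled pool points.
            picked = rng.choice(unlabeled, batch, replace=False)
        else:
            # Uncertainty sampling: label the highest-entropy pool points.
            scores = entropy(predict_proba(centroids, X_pool[unlabeled]))
            picked = unlabeled[np.argsort(scores)[-batch:]]
        labeled.extend(int(i) for i in picked)
    return accs, labeled

rng = np.random.default_rng(0)
X_pool, y_pool = make_blobs(200, rng)
X_test, y_test = make_blobs(100, rng)
for strategy in ("random", "entropy"):
    accs, _ = run(strategy, X_pool, y_pool, X_test, y_test)
    print(strategy, [round(a, 3) for a in accs])
```

Both strategies are scored on the same held-out set after each round, which gives one accuracy-versus-budget curve per strategy. In a realistic study, the toy classifier would be replaced by a deep network, and the training procedure (for example data augmentation or a semi-supervised objective) would be varied while the acquisition strategies are held fixed.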