CV · Oct 9, 2021

Class-Balanced Active Learning for Image Classification

arXiv:2110.04543v1 · 38 citations
Originality: Incremental advance
AI Analysis

This paper addresses active learning for image classification on real-world datasets with long-tail class distributions. It offers an incremental improvement by integrating class balancing into existing active learning methods.

The paper tackles active learning on imbalanced datasets by proposing a class-balanced optimization framework, and reports performance gains on three datasets in both imbalanced and balanced settings.

Active learning aims to reduce the labeling effort required to train algorithms by learning an acquisition function that selects, from a large unlabeled data pool, the most relevant data for which a label should be requested. Active learning is generally studied on balanced datasets, where an equal number of images per class is available. However, real-world datasets suffer from severely imbalanced classes, the so-called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we proposed a general optimization framework that explicitly takes class balancing into account. Results on three datasets showed that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informativeness-based and representativeness-based active learning methods. In addition, we showed that our method generally results in a performance gain on balanced datasets as well.
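To make the idea of class-balanced acquisition concrete, here is a minimal sketch under stated assumptions. It is not the authors' optimization framework: it is a simple greedy heuristic that scores each unlabeled sample by its predictive entropy (an informativeness measure) minus a penalty on the class imbalance expected after adding it. The function names, the `lam` trade-off weight, and the uniform per-class target are all hypothetical choices made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def class_balanced_select(pool_probs, labeled_counts, budget, lam=1.0):
    """Greedily pick up to `budget` pool indices.

    pool_probs:     one softmax vector per unlabeled sample
    labeled_counts: number of labeled samples per class so far
    lam:            hypothetical weight on the imbalance penalty
                    (lam=0 reduces to plain entropy sampling)
    """
    num_classes = len(labeled_counts)
    # Expected per-class counts; updated with soft predictions as we select.
    counts = [float(c) for c in labeled_counts]
    # Assumed target: a uniform class distribution after this round.
    target = (sum(counts) + budget) / num_classes

    remaining = list(range(len(pool_probs)))
    selected = []
    for _ in range(min(budget, len(pool_probs))):

        def score(i):
            # L1 distance to the target if sample i were added, using its
            # predicted probabilities as a soft stand-in for the true label.
            imbalance = sum(
                abs(target - (c + p)) for c, p in zip(counts, pool_probs[i])
            )
            return entropy(pool_probs[i]) - lam * imbalance

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
        counts = [c + p for c, p in zip(counts, pool_probs[best])]
    return selected


# Toy pool: class 0 is already heavily labeled, class 1 has no labels.
pool = [[0.5, 0.5], [0.9, 0.1], [0.1, 0.9], [0.45, 0.55]]
print(class_balanced_select(pool, labeled_counts=[10, 0], budget=1, lam=0.0))
print(class_balanced_select(pool, labeled_counts=[10, 0], budget=1, lam=1.0))
```

With `lam=0.0` the sketch picks the most uncertain sample. With `lam=1.0` it prefers a sample predicted to belong to the underrepresented class, even though that sample is less uncertain. Because true labels are unknown at selection time, the predicted probabilities act only as a proxy for class membership.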

Code Implementations: 1 repo
