LG · Jan 25, 2022

Cold Start Active Learning Strategies in the Context of Imbalanced Classification

arXiv:2201.10227v1 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of initializing classification on imbalanced datasets when no labels are available. The contribution is incremental, as it builds on existing active learning methods.

The paper tackles the cold start problem in active learning for imbalanced classification. It proposes strategies that combine clustering and label propagation to address label scarcity and class imbalance, and it reports improved recall for the minority class in a Twitter case study on flood event testimonies.

We present novel active learning strategies dedicated to the cold start stage, i.e., initializing the classification of a large set of data with no attached labels. Moreover, the proposed strategies are designed to handle an imbalanced context in which random selection is highly inefficient. Specifically, our active learning iterations address label scarcity and imbalance using element scores that combine information extracted from a clustering structure with a label propagation model. The strategy is illustrated by a case study on annotating Twitter content with respect to testimonies of a real flood event. We show that our method effectively copes with class imbalance by boosting the recall of samples from the minority class.
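The abstract does not give the exact scoring rule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes cluster assignments are already available, and it uses a hypothetical score: the propagated minority-class estimate plus a bonus for clusters that contain no annotated element yet. The names `propagate`, `select_query`, `sigma`, and `beta` are illustrative assumptions.

```python
import math

def propagate(points, labels, sigma=1.0, iters=50):
    """Simple label propagation on a fully connected Gaussian-kernel graph.

    labels maps an index to 0 (majority) or 1 (minority) for annotated points.
    Returns one minority-class estimate in [0, 1] per point.
    """
    n = len(points)
    # Pairwise similarities; the diagonal is zero so a point ignores itself.
    w = [[0.0 if i == j else
          math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    # Unlabeled points start at 0.5 (assumption: no prior knowledge).
    f = [float(labels.get(i, 0.5)) for i in range(n)]
    for _ in range(iters):
        updated = []
        for i in range(n):
            if i in labels:
                updated.append(float(labels[i]))  # clamp annotated points
                continue
            total = sum(w[i])
            if total > 0:
                updated.append(sum(w[i][j] * f[j] for j in range(n)) / total)
            else:
                updated.append(f[i])
        f = updated
    return f

def select_query(points, clusters, labels, beta=1.0):
    """Pick the unlabeled index with the highest hypothetical score.

    score = propagated minority estimate + beta if the element's cluster
    has no annotated element yet (an exploration bonus, assumed here).
    """
    f = propagate(points, labels)
    explored = {clusters[i] for i in labels}
    best, best_score = None, float("-inf")
    for i in range(len(points)):
        if i in labels:
            continue
        score = f[i] + (beta if clusters[i] not in explored else 0.0)
        if score > best_score:
            best, best_score = i, score
    return best

# Toy data: two well-separated groups with precomputed cluster ids.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
clusters = [0, 0, 0, 1, 1, 1]
labels = {0: 0}  # a single majority-class annotation
next_index = select_query(points, clusters, labels)
```

With only one annotation in cluster 0, the exploration bonus steers the next query toward the unexplored cluster. Once both clusters hold an annotation, the propagated estimate dominates and favors neighbors of known minority elements, which is one plausible way to spend the annotation budget on the rare class instead of sampling at random.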

