SUPClust: Active Learning at the Boundaries
The work addresses the challenge of reducing labeling costs in machine learning applications, but its contribution appears incremental because it builds on existing active learning paradigms.
The paper tackles the problem of expensive labeled data acquisition in active learning by proposing SUPClust, a method that identifies points at decision boundaries to refine model predictions, showing strong performance even with class imbalance.
Active learning is a machine learning paradigm designed to optimize model performance in settings where labeled data is expensive to acquire. In this work, we propose a novel active learning method called SUPClust that seeks to identify points at the decision boundary between classes. By targeting these points, SUPClust aims to gather the labels that are most informative for refining the model's prediction of complex decision regions. We demonstrate experimentally that labeling these points leads to strong model performance. This improvement is observed even in scenarios characterized by strong class imbalance.
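To make the idea of querying points near a decision boundary concrete, here is a minimal sketch of generic margin sampling. It is not the SUPClust procedure, which the abstract does not specify. It is a common stand-in that treats a small gap between the two most probable classes as a sign that a point lies near a boundary. The names `margin`, `select_boundary_points`, `pool_probs`, and `budget` are hypothetical, and the probabilities are made-up toy values rather than results from the paper.

```python
from typing import List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (needs at least two classes).

    A small gap means the model is torn between two classes, which we take
    here as a proxy for the point lying close to a decision boundary.
    """
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_boundary_points(
    prob_rows: Sequence[Sequence[float]], budget: int
) -> List[int]:
    """Return indices of the `budget` unlabeled points with the smallest margin."""
    order = sorted(range(len(prob_rows)), key=lambda i: margin(prob_rows[i]))
    return order[:budget]


# Toy pool (assumed values): each row holds a hypothetical model's class
# probabilities for one unlabeled point.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, far from any boundary
    [0.40, 0.35, 0.25],  # ambiguous
    [0.51, 0.49, 0.00],  # nearly tied between two classes
    [0.70, 0.20, 0.10],
]

# Indices of the points we would send to an annotator under this heuristic.
queried = select_boundary_points(pool_probs, budget=2)
print(queried)
```

In a full active learning loop, the queried points would be labeled, added to the training set, and the model retrained before the next selection round. A purely margin-based rule like this one makes no attempt to handle class imbalance, which the abstract reports SUPClust copes with.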