CV · Jul 7, 2023

Training Ensembles with Inliers and Outliers for Semi-supervised Active Learning

Vladan Stojnić, Zakaria Laskar, Giorgos Tolias

arXiv:2307.03741v13.93 citations · h-index: 28 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a realistic problem for practitioners: semi-supervised active learning when the unlabeled data contains outliers. It appears incremental, since it integrates existing components.

The paper tackles deep active learning in the presence of outlier examples by combining three components: joint classifier training with inliers and outliers, semi-supervised learning via pseudo-labeling, and model ensembling. The authors report improved classifier accuracy and better results than the other methods evaluated.

Deep active learning in the presence of outlier examples poses a realistic yet challenging scenario. Acquiring unlabeled data for annotation requires a delicate balance between avoiding outliers to conserve the annotation budget and prioritizing useful inlier examples for effective training. In this work, we present an approach that leverages three highly synergistic components, which are identified as key ingredients: joint classifier training with inliers and outliers, semi-supervised learning through pseudo-labeling, and model ensembling. Our work demonstrates that ensembling significantly enhances the accuracy of pseudo-labeling and improves the quality of data acquisition. By enabling semi-supervision through the joint training process, where outliers are properly handled, we observe a substantial boost in classifier accuracy through the use of all available unlabeled examples. Notably, we reveal that the integration of joint training renders explicit outlier detection, a conventional component for acquisition in prior work, unnecessary. The three key components align seamlessly with numerous existing approaches. Through empirical evaluations, we showcase that their combined use leads to a performance increase. Remarkably, despite its simplicity, our proposed approach outperforms all other methods. Code: https://github.com/vladan-stojnic/active-outliers
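
To make the three components concrete, here is a minimal sketch in plain Python. It is not taken from the authors' repository, and every name in it (`average_ensemble`, `pseudo_label`, `acquire`) is hypothetical. It assumes each ensemble member outputs probabilities over K inlier classes plus one extra outlier class, that pseudo-labels are kept only above a confidence threshold, and that acquisition ranks examples by entropy of the averaged prediction. The paper may use different choices for any of these.

```python
import math

def average_ensemble(member_probs):
    """Average class probabilities over ensemble members.

    member_probs[m][i][c] is member m's probability of class c for example i.
    Assumption: the last class index stands for "outlier".
    """
    n_members = len(member_probs)
    n_examples = len(member_probs[0])
    n_classes = len(member_probs[0][0])
    return [
        [sum(m[i][c] for m in member_probs) / n_members for c in range(n_classes)]
        for i in range(n_examples)
    ]

def pseudo_label(avg_probs, threshold=0.9):
    """Keep a pseudo-label only where the averaged prediction is confident.

    Returns {example_index: class_index}. An example can be pseudo-labeled
    as an outlier, so no separate outlier detector is involved here.
    """
    labels = {}
    for i, p in enumerate(avg_probs):
        best = max(range(len(p)), key=p.__getitem__)
        if p[best] >= threshold:
            labels[i] = best
    return labels

def entropy(p):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0)

def acquire(avg_probs, budget):
    """Select the `budget` examples with the highest predictive entropy.

    Entropy is a stand-in acquisition score chosen for illustration.
    """
    ranked = sorted(
        range(len(avg_probs)),
        key=lambda i: entropy(avg_probs[i]),
        reverse=True,
    )
    return ranked[:budget]
```

With two members and three classes (two inlier classes plus the outlier class), an example on which both members are confident receives a pseudo-label, while an example on which the averaged prediction is spread across classes is the one sent for annotation.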
