Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning
The paper addresses a more complex and realistic semi-supervised learning scenario that matters to machine learning practitioners, though the contribution is incremental because it builds on existing SSL methods.
The paper tackles the problem of open-set semi-supervised learning, where unlabeled data contains out-of-distribution samples, by proposing a multi-task curriculum framework that jointly optimizes OOD detection and classification, achieving state-of-the-art results.
Semi-supervised learning (SSL) has been proposed to leverage unlabeled data for training powerful models when only limited labeled data is available. While existing SSL methods assume that the labeled and unlabeled data share the same classes, we address a more complex, novel scenario named open-set SSL, where the unlabeled data contains out-of-distribution (OOD) samples. Instead of training an OOD detector and an SSL model separately, we propose a multi-task curriculum learning framework. First, to detect the OOD samples in the unlabeled data, we estimate the probability that each sample is OOD. We use a joint optimization framework, which alternately updates the network parameters and the OOD scores. Simultaneously, to achieve high classification performance on in-distribution (ID) data, we select the unlabeled samples with small OOD scores as ID samples and use them together with the labeled data to train a deep neural network that classifies ID samples in a semi-supervised manner. We conduct several experiments, and our method achieves state-of-the-art results by successfully eliminating the effect of OOD samples.
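The alternating update and the score-based selection described above can be illustrated with a toy sketch. It is not the paper's implementation. A logistic regression on raw features stands in for the deep network, and several details are assumptions made for illustration only: labeled samples are fixed at OOD score 0, unlabeled samples start at score 1, and selection uses a fixed threshold of 0.5. The function names (`fit_ood_detector`, `alternate_ood_scores`, `select_id_samples`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_ood_detector(x, targets, w, b, lr=0.5, steps=200):
    """Update the model parameters with the OOD scores (soft targets) held fixed.

    Plain gradient descent on binary cross-entropy; a stand-in for
    training a deep network.
    """
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        grad = p - targets  # derivative of BCE with respect to the logit
        w = w - lr * (x.T @ grad) / len(x)
        b = b - lr * grad.mean()
    return w, b


def alternate_ood_scores(x_labeled, x_unlabeled, rounds=5):
    """Alternate between fitting the model and re-estimating OOD scores."""
    x = np.vstack([x_labeled, x_unlabeled])
    n_l = len(x_labeled)
    # Assumption: labeled data is ID (score 0); unlabeled data starts as OOD (score 1).
    scores = np.concatenate([np.zeros(n_l), np.ones(len(x_unlabeled))])
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(rounds):
        # Step 1: update parameters with the scores fixed.
        w, b = fit_ood_detector(x, scores, w, b)
        # Step 2: update the scores of unlabeled samples with the parameters fixed.
        scores[n_l:] = sigmoid(x_unlabeled @ w + b)
    return scores[n_l:]


def select_id_samples(x_unlabeled, ood_scores, threshold=0.5):
    """Keep unlabeled samples whose OOD score is small (assumed fixed threshold)."""
    mask = ood_scores < threshold
    return x_unlabeled[mask], mask


# Synthetic usage: ID samples cluster near the origin, OOD samples near (6, 6).
rng = np.random.default_rng(0)
x_labeled = rng.normal(0.0, 1.0, size=(50, 2))
x_unlabeled = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 2)),  # unlabeled ID samples
    rng.normal(6.0, 1.0, size=(50, 2)),  # unlabeled OOD samples
])
ood_scores = alternate_ood_scores(x_labeled, x_unlabeled)
x_selected, id_mask = select_id_samples(x_unlabeled, ood_scores)
```

In a full pipeline, the selected samples would be passed with the labeled data to a semi-supervised classifier. That stage is omitted here because the text above does not specify which SSL method is used.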