LG · May 11, 2024

Robust Semi-supervised Learning by Wisely Leveraging Open-set Data

arXiv:2405.06979v2 · 33 citations · h-index: 4 · IEEE Trans Pattern Anal Mach Intell
Originality: Incremental advance
AI Analysis

This paper addresses a realistic challenge that machine learning practitioners face in semi-supervised learning, but the advance is incremental: it builds on existing OSSL approaches and adds a new selection strategy.

The paper tackles the performance degradation that semi-supervised learning suffers when unlabeled data includes out-of-distribution (OOD) classes. It proposes WiseOpen, a framework that selectively uses a friendly subset of the open-set data to improve in-distribution classification, and reports state-of-the-art results in its experiments.

Open-set Semi-supervised Learning (OSSL) considers a realistic setting in which unlabeled data may come from classes unseen in the labeled set, i.e., out-of-distribution (OOD) data, which can degrade the performance of conventional SSL models. To handle this issue, some existing OSSL approaches employ, in addition to the traditional in-distribution (ID) classifier, an extra OOD detection module to avoid the potential negative impact of the OOD data. Nevertheless, these approaches typically use the entire set of open-set data during training, which may contain data that is unfriendly to the OSSL task and negatively affects model performance. This inspires us to develop a robust open-set data selection strategy for OSSL. Based on a theoretical analysis from the perspective of learning theory, we propose Wise Open-set Semi-supervised Learning (WiseOpen), a generic OSSL framework that selectively leverages the open-set data for training the model. By applying a gradient-variance-based selection mechanism, WiseOpen exploits a friendly subset instead of the whole open-set dataset to enhance the model's capability of ID classification. Moreover, to reduce the computational expense, we also propose two practical variants of WiseOpen that adopt low-frequency update and loss-based selection, respectively. Extensive experiments demonstrate the effectiveness of WiseOpen in comparison with the state-of-the-art.
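
The abstract names two selection signals (gradient variance and loss) without giving the exact rules, so the following is only a minimal sketch of what filtering on such signals could look like, not WiseOpen's actual criterion. The function names, the thresholds, and the "keep samples whose gradient stays close to the batch mean" rule are all assumptions made for illustration.

```python
from typing import List, Sequence


def squared_deviation(grad: Sequence[float], mean_grad: Sequence[float]) -> float:
    """Squared Euclidean distance between one gradient and the mean gradient."""
    return sum((g - m) ** 2 for g, m in zip(grad, mean_grad))


def select_by_gradient_deviation(
    grads: Sequence[Sequence[float]], threshold: float
) -> List[int]:
    """Return indices of samples whose gradient lies close to the batch mean.

    Hypothetical criterion (not taken from the paper): the average of these
    squared deviations is the trace of the empirical gradient covariance,
    so the deviation serves here as a per-sample variance contribution.
    """
    n = len(grads)
    dim = len(grads[0])
    mean_grad = [sum(g[d] for g in grads) / n for d in range(dim)]
    return [
        i for i, g in enumerate(grads)
        if squared_deviation(g, mean_grad) <= threshold
    ]


def select_by_loss(losses: Sequence[float], threshold: float) -> List[int]:
    """Cheaper hypothetical variant: keep samples with a small per-sample loss."""
    return [i for i, loss in enumerate(losses) if loss <= threshold]


# Toy per-sample gradients for four unlabeled samples; the last is an outlier.
grads = [[1.0, 0.0], [1.2, 0.1], [0.9, -0.1], [-3.0, 4.0]]
print(select_by_gradient_deviation(grads, threshold=5.0))  # [0, 1, 2]

# Toy per-sample losses for the same kind of filtering.
losses = [0.2, 0.3, 2.5, 0.1]
print(select_by_loss(losses, threshold=1.0))  # [0, 1, 3]
```

The loss-based function needs only a forward pass per sample, which is one plausible reading of why the abstract presents loss-based selection as a way to reduce computational expense.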

