CV Sep 26, 2024

SCOMatch: Alleviating Overtrusting in Open-set Semi-supervised Learning

Zerun Wang, Liuyu Xiang, Lang Huang, Jiafeng Mao, Ling Xiao, Toshihiko Yamasaki

arXiv:2409.17512v17.67 citations h-index: 36 Has Code

Originality: Incremental advance

AI Analysis

This paper addresses a specific bottleneck in open-set semi-supervised learning that is relevant to machine learning practitioners, and offers an incremental improvement over existing methods.

The paper tackles overtrusting in open-set semi-supervised learning: because labeled data are scarce, they form a biased sample of the in-distribution data, and prior methods overfit the ID/OOD decision boundary to them. It proposes SCOMatch, which treats out-of-distribution samples as an additional class, and reports significant improvements over state-of-the-art methods on various benchmarks.

Open-set semi-supervised learning (OSSL) leverages practical open-set unlabeled data, comprising both in-distribution (ID) samples from seen classes and out-of-distribution (OOD) samples from unseen classes, for semi-supervised learning (SSL). Prior OSSL methods first learn the decision boundary between ID and OOD from labeled ID data, then employ self-training to refine this boundary. These methods, however, tend to overtrust the labeled ID data: the scarcity of labeled data causes a distribution bias between the labeled samples and the entire ID data, which misleads the decision boundary into overfitting. The subsequent self-training process, which builds on the overfitted result, fails to rectify this problem. In this paper, we address the overtrusting issue by treating OOD samples as an additional class, forming a new SSL process. Specifically, we propose SCOMatch, a novel OSSL method that 1) selects reliable OOD samples as new labeled data with an OOD memory queue and a corresponding update strategy and 2) integrates the new SSL process into the original task through our Simultaneous Close-set and Open-set self-training. SCOMatch refines the decision boundary between ID and OOD classes across the entire dataset, thereby leading to improved results. Extensive experimental results show that SCOMatch significantly outperforms state-of-the-art methods on various benchmarks. Its effectiveness is further verified through ablation studies and visualization.
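To make the idea of an OOD memory queue more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the update strategy, so the rule used here (keep the unlabeled samples that a (K+1)-way classifier most confidently assigns to the extra OOD class, up to a fixed capacity) is an assumption. The class name `OODMemoryQueue`, its methods, and the parameters `capacity` and `ood_index` are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


class OODMemoryQueue:
    """Bounded store of confident OOD candidates (hypothetical design, not from the paper)."""

    def __init__(self, capacity: int, ood_index: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Index of the additional OOD class; with K ID classes this would be K.
        self.ood_index = ood_index
        # Maps sample id -> confidence of the OOD prediction.
        self._confidence: Dict[int, float] = {}

    def update(self, sample_ids: Sequence[int], probs: Sequence[Sequence[float]]) -> None:
        """Assumed update rule: track samples predicted as OOD, keep the most confident."""
        for sample_id, p in zip(sample_ids, probs):
            predicted = max(range(len(p)), key=lambda k: p[k])
            if predicted == self.ood_index:
                self._confidence[sample_id] = p[self.ood_index]
            else:
                # A sample no longer predicted as OOD is dropped from the queue.
                self._confidence.pop(sample_id, None)
        ranked = sorted(self._confidence.items(), key=lambda kv: kv[1], reverse=True)
        self._confidence = dict(ranked[: self.capacity])

    def labeled_ood(self) -> List[Tuple[int, int]]:
        """Return (sample_id, ood_index) pairs, most confident first."""
        ranked = sorted(self._confidence.items(), key=lambda kv: kv[1], reverse=True)
        return [(sample_id, self.ood_index) for sample_id, _ in ranked]


# Two ID classes (indices 0 and 1) plus the additional OOD class (index 2).
queue = OODMemoryQueue(capacity=2, ood_index=2)
queue.update(
    [10, 11, 12, 13],
    [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.05, 0.05, 0.9], [0.2, 0.2, 0.6]],
)
print(queue.labeled_ood())  # [(12, 2), (11, 2)]
```

In a training loop, the pairs returned by `labeled_ood()` could be mixed with the labeled ID data so that the OOD class receives supervised signal alongside the seen classes. How SCOMatch actually selects samples and combines the two self-training processes is described in the paper itself.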
