LG · CV · ML · Jul 16, 2024

ProSub: Probabilistic Open-Set Semi-Supervised Learning with Subspace-Based Out-of-Distribution Detection

arXiv:2407.11735v2 · 6 citations · h-index: 22 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the problem of unknown classes appearing in unlabeled data, a practical concern for machine learning practitioners, and represents an incremental improvement over existing methods.

The paper tackles open-set semi-supervised learning by proposing ProSub, a method that uses subspace-based out-of-distribution detection and probabilistic predictions, achieving state-of-the-art performance on several benchmarks.

In open-set semi-supervised learning (OSSL), we consider unlabeled datasets that may contain unknown classes. Existing OSSL methods often use the softmax confidence for classifying data as in-distribution (ID) or out-of-distribution (OOD). Additionally, many works for OSSL rely on ad-hoc thresholds for ID/OOD classification, without considering the statistics of the problem. We propose a new score for ID/OOD classification based on angles in feature space between data and an ID subspace. Moreover, we propose an approach to estimate the conditional distributions of scores given ID or OOD data, enabling probabilistic predictions of data being ID or OOD. These components are put together in a framework for OSSL, termed ProSub, that is experimentally shown to reach SOTA performance on several benchmark problems. Our code is available at https://github.com/walline/prosub.
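
The two ideas in the abstract, an angle-based score against an ID subspace and a probabilistic ID/OOD prediction from class-conditional score distributions, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). Every function name is hypothetical, and the SVD subspace fit, the Gaussian form of the score distributions, and the fixed prior are all assumptions made here for illustration only.

```python
import numpy as np


def fit_id_subspace(id_features, dim):
    """Return an orthonormal basis (d x dim) for an assumed ID subspace.

    Assumption: the subspace is spanned by the top `dim` right-singular
    vectors of the uncentered ID feature matrix (n x d).
    """
    _, _, vt = np.linalg.svd(id_features, full_matrices=False)
    return vt[:dim].T


def angle_score(features, basis):
    """Cosine of the angle between each feature vector and the subspace.

    Values lie in [0, 1]: 1 means the vector lies in the subspace,
    0 means it is orthogonal to it.
    """
    proj_norm = np.linalg.norm(features @ basis, axis=1)
    feat_norm = np.linalg.norm(features, axis=1)
    return proj_norm / np.maximum(feat_norm, 1e-12)


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))


def prob_id(scores, id_params, ood_params, prior_id=0.5):
    """Posterior p(ID | score) via Bayes' rule.

    Assumption: scores given ID and given OOD are each modeled as
    Gaussian with (mean, std) parameters; `prior_id` is p(ID).
    """
    lik_id = gaussian_pdf(scores, *id_params)
    lik_ood = gaussian_pdf(scores, *ood_params)
    numerator = prior_id * lik_id
    # Tiny constant guards against 0/0 when both likelihoods underflow.
    return numerator / (numerator + (1.0 - prior_id) * lik_ood + 1e-300)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: ID features lie near a 2-D subspace of an 8-D space,
    # OOD features are isotropic noise.
    true_basis = np.linalg.qr(rng.normal(size=(8, 2)))[0]
    id_feats = rng.normal(size=(200, 2)) @ true_basis.T
    id_feats += 0.05 * rng.normal(size=id_feats.shape)
    ood_feats = rng.normal(size=(200, 8))

    basis = fit_id_subspace(id_feats, dim=2)
    s_id = angle_score(id_feats, basis)
    s_ood = angle_score(ood_feats, basis)

    # Estimate the (assumed Gaussian) conditional score distributions.
    id_params = (s_id.mean(), s_id.std())
    ood_params = (s_ood.mean(), s_ood.std())
    print("mean p(ID) on ID data: ", prob_id(s_id, id_params, ood_params).mean())
    print("mean p(ID) on OOD data:", prob_id(s_ood, id_params, ood_params).mean())
```

In this toy setup the ID and OOD labels are known, so the conditional distributions can be fitted directly. In OSSL the unlabeled data has no such labels, and estimating those distributions is the part the abstract credits to the paper's own approach.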

Code Implementations: 1 repo
