CV, LG | Jul 5, 2022

OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning

arXiv:2207.02261v2 | 71 citations | h-index: 95
Originality: Highly original
AI Analysis

OpenLDN addresses a limitation of standard SSL in real-world scenarios where labeled and unlabeled data follow different distributions, which broadens its applicability in domains such as image classification.

The paper tackles the open-world semi-supervised learning problem, in which labeled and unlabeled data come from different distributions. It introduces OpenLDN, which recognizes known classes while detecting and clustering novel classes, and reports state-of-the-art performance on multiple benchmarks with a better accuracy/training-time trade-off.

Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning. Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data. One common assumption in most SSL methods is that the labeled and unlabeled data are from the same data distribution. However, this is hardly the case in many real-world scenarios, which limits their applicability. In this work, instead, we attempt to solve the challenging open-world SSL problem that does not make such an assumption. In the open-world SSL problem, the objective is to recognize samples of known classes, and simultaneously detect and cluster samples belonging to novel classes present in unlabeled data. This work introduces OpenLDN, which utilizes a pairwise similarity loss to discover novel classes. Using a bi-level optimization rule, this pairwise similarity loss exploits the information available in the labeled set to implicitly cluster novel class samples, while simultaneously recognizing samples from known classes. After discovering novel classes, OpenLDN transforms the open-world SSL problem into a standard SSL problem to achieve additional performance gains using existing SSL methods. Our extensive experiments demonstrate that OpenLDN outperforms the current state-of-the-art methods on multiple popular classification benchmarks while providing a better accuracy/training time trade-off.
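To make the idea of a pairwise similarity loss more concrete, the following is a minimal sketch of one generic form of such a loss. It is not the OpenLDN implementation. It assumes that the similarity of two samples is the inner product of their predicted class-probability vectors, and that a target similarity in [0, 1] is available for each pair. The names `softmax`, `pairwise_similarity_loss`, and `targets_from_labels` are hypothetical. The sketch derives targets only for labeled pairs; how targets are obtained for unlabeled pairs is what the abstract's bi-level optimization rule addresses, and that part is not modeled here.

```python
import math
from itertools import combinations


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def targets_from_labels(labels):
    """Build pairwise targets from labels (assumption: None marks unlabeled).

    Only pairs where both samples are labeled receive a target:
    1.0 if the labels match, 0.0 otherwise.
    """
    targets = {}
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] is not None and labels[j] is not None:
            targets[(i, j)] = 1.0 if labels[i] == labels[j] else 0.0
    return targets


def pairwise_similarity_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted and target pair similarity.

    probs:   list of class-probability vectors, one per sample.
    targets: dict mapping (i, j) with i < j to a similarity in [0, 1].
    Pairs without a target are skipped; returns 0.0 if no pair has one.
    """
    total, count = 0.0, 0
    for (i, j), t in targets.items():
        # Predicted probability that samples i and j share a class.
        s = sum(a * b for a, b in zip(probs[i], probs[j]))
        s = min(max(s, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(s) + (1.0 - t) * math.log(1.0 - s))
        count += 1
    return total / count if count else 0.0


# Usage: three samples, the third one unlabeled.
logits = [[4.0, 0.0, 0.0], [3.5, 0.2, 0.0], [0.0, 0.0, 4.0]]
probs = [softmax(z) for z in logits]
labels = [0, 0, None]
loss = pairwise_similarity_loss(probs, targets_from_labels(labels))
print(f"pairwise loss: {loss:.4f}")
```

Because the loss only constrains whether two samples fall in the same class, not which class index they receive, an objective of this shape can group samples of unseen classes without ever naming those classes, which is the sense in which the abstract describes the clustering as implicit.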

Code Implementations: 1 repo
