LG · AI · CV · Aug 12, 2022

USB: A Unified Semi-supervised Learning Benchmark for Classification

CMU · Peking U
arXiv:2208.07204v2 · 146 citations · h-index: 137 · Has Code
Originality: Synthesis-oriented
AI Analysis

USB provides a standardized and cost-effective evaluation framework for SSL researchers, though the contribution is incremental, since it builds on existing SSL methods and datasets.

The authors address the lack of a comprehensive, efficient benchmark for semi-supervised learning (SSL) across multiple domains. They construct USB, a unified benchmark of 15 tasks spanning computer vision, natural language processing, and audio processing. On a single NVIDIA V100, evaluating FixMatch on all 15 USB tasks takes 39 GPU days, compared with 335 GPU days for 5 CV tasks with TorchSSL.

Semi-supervised learning (SSL) improves model generalization by leveraging massive unlabeled data to augment limited labeled samples. However, currently, popular SSL evaluation protocols are often constrained to computer vision (CV) tasks. In addition, previous work typically trains deep neural networks from scratch, which is time-consuming and environmentally unfriendly. To address the above issues, we construct a Unified SSL Benchmark (USB) for classification by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which we systematically evaluate the dominant SSL methods, and also open-source a modular and extensible codebase for fair evaluation of these SSL methods. We further provide the pre-trained versions of the state-of-the-art neural models for CV tasks to make the cost affordable for further tuning. USB enables the evaluation of a single SSL algorithm on more tasks from multiple domains but with less cost. Specifically, on a single NVIDIA V100, only 39 GPU days are required to evaluate FixMatch on 15 tasks in USB while 335 GPU days (279 GPU days on 4 CV datasets except for ImageNet) are needed on 5 CV tasks with TorchSSL.
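The cost figures quoted in the abstract can be put side by side with a little arithmetic. The sketch below is illustrative only: the three GPU-day numbers and task counts come from the abstract, while the function name `summarize_cost` and the derived quantities (per-task averages, ratios, and the ImageNet share obtained by subtracting 279 from 335) are assumptions made here, not results reported by the authors.

```python
# GPU-day figures quoted in the abstract (FixMatch, single NVIDIA V100).
USB_GPU_DAYS = 39                   # 15 tasks in USB
USB_TASKS = 15
TORCHSSL_GPU_DAYS = 335             # 5 CV tasks with TorchSSL
TORCHSSL_TASKS = 5
TORCHSSL_GPU_DAYS_NO_IMAGENET = 279 # 4 CV datasets, ImageNet excluded


def summarize_cost():
    """Return derived cost comparisons (hypothetical helper, not from the paper)."""
    usb_per_task = USB_GPU_DAYS / USB_TASKS
    torchssl_per_task = TORCHSSL_GPU_DAYS / TORCHSSL_TASKS
    return {
        # Total evaluation cost ratio, TorchSSL protocol vs. USB.
        "total_ratio": TORCHSSL_GPU_DAYS / USB_GPU_DAYS,
        # Average GPU days per task under each protocol.
        "usb_per_task": usb_per_task,
        "torchssl_per_task": torchssl_per_task,
        "per_task_ratio": torchssl_per_task / usb_per_task,
        # Inferred by subtraction; the abstract does not state this directly.
        "imagenet_gpu_days": TORCHSSL_GPU_DAYS - TORCHSSL_GPU_DAYS_NO_IMAGENET,
    }


if __name__ == "__main__":
    for name, value in summarize_cost().items():
        print(f"{name}: {value:.1f}")
```

Read this way, the total cost drops by roughly a factor of 8.6 while the number of tasks triples, so the average cost per task falls from 67 GPU days to 2.6.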

Code Implementations: 5 repos
