LG CV | Apr 18, 2024

VCC-INFUSE: Towards Accurate and Efficient Selection of Unlabeled Examples in Semi-supervised Learning

arXiv:2404.11947v22.61 citations | h-index: 4 | IJCAI

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of efficiently selecting high-quality unlabeled examples in semi-supervised learning, though it appears incremental because it builds on existing pseudo-label methods.

The paper tackles the problem of ineffective and inefficient utilization of unlabeled data in semi-supervised learning by proposing VCC-INFUSE, which reduces classification error rates and saves training time. Specifically, it reduces the error rate of FlexMatch on CIFAR-100 by 1.08% while cutting training time nearly in half.

Despite the progress of Semi-supervised Learning (SSL), existing methods fail to utilize unlabeled data effectively and efficiently. Many pseudo-label-based methods select unlabeled examples based on inaccurate confidence scores from the classifier. Most prior work also uses all available unlabeled data without pruning, making it difficult to handle large amounts of unlabeled data. To address these issues, we propose two methods: Variational Confidence Calibration (VCC) and Influence-Function-based Unlabeled Sample Elimination (INFUSE). VCC is a universal plugin for SSL confidence calibration that uses a variational autoencoder to select more accurate pseudo labels based on three types of consistency scores. INFUSE is a data pruning method that constructs a core dataset of unlabeled examples under SSL. Our methods are effective on multiple datasets and in multiple settings, reducing classification error rates and saving training time. Together, VCC-INFUSE reduces the error rate of FlexMatch on the CIFAR-100 dataset by 1.08% while saving nearly half of the training time.
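
For readers unfamiliar with the baseline the abstract criticizes, the following is a minimal sketch of confidence-thresholded pseudo-label selection as commonly used in pseudo-label-based SSL. It is not the VCC or INFUSE method. The function names, the 0.95 threshold, and the example logits are assumptions chosen for illustration and do not come from the paper.

```python
import math


def softmax(logits):
    """Convert raw classifier logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold=0.95):
    """Keep unlabeled examples whose top softmax probability meets the threshold.

    Hypothetical helper: returns (example_index, pseudo_label, confidence) tuples.
    The threshold value is an assumed default, not a setting from the paper.
    """
    selected = []
    for idx, logits in enumerate(batch_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((idx, probs.index(confidence), confidence))
    return selected


# Illustrative logits for two unlabeled examples over three classes.
unlabeled_logits = [
    [10.0, 0.0, 0.0],  # very confident in class 0
    [1.0, 0.9, 0.8],   # nearly uniform, so it falls below the threshold
]
chosen = select_pseudo_labels(unlabeled_logits)
```

Because the raw softmax confidence is the only selection signal here, a miscalibrated classifier can admit wrong pseudo labels or reject correct ones. According to the abstract, VCC targets that weakness by calibrating confidence from three types of consistency scores.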
