Leveraging Label Proportion Prior for Class-Imbalanced Semi-Supervised Learning
This work addresses class imbalance in semi-supervised learning, a problem that matters in applications where labeled data are scarce and the class distribution is skewed.
The authors add a proportion-based regularization term to existing semi-supervised methods. On Long-tailed CIFAR-10, the resulting models improve consistently over the baselines across imbalance severities and label ratios, and are competitive with or superior to existing methods, particularly under scarce-label conditions.
Semi-supervised learning (SSL) often suffers under class imbalance, where pseudo-labeling amplifies majority bias and suppresses minority performance. We address this issue with a lightweight framework that, to our knowledge, is the first to introduce Proportion Loss, a technique from learning from label proportions (LLP), into SSL as a regularization term. Proportion Loss aligns model predictions with the global class distribution, mitigating bias across both majority and minority classes. To further stabilize training, we formulate a stochastic variant that accounts for fluctuations in mini-batch composition. Experiments on the Long-tailed CIFAR-10 benchmark show that integrating Proportion Loss into FixMatch and ReMixMatch consistently improves performance over the baselines across imbalance severities and label ratios, and achieves competitive or superior results compared to existing class-imbalanced SSL (CISSL) methods, particularly under scarce-label conditions.
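To make the idea of aligning predictions with a class distribution concrete, the following is a minimal sketch of one common form of proportion loss from the LLP literature: the cross-entropy between a known class prior and the batch-averaged predicted probabilities. The abstract does not give the exact formulation, so the function name `proportion_loss`, the cross-entropy form, the `eps` smoothing constant, and the toy prior are all assumptions for illustration. The stochastic variant is not sketched here because its details are not stated.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def proportion_loss(logits, prior, eps=1e-8):
    """Cross-entropy between a class prior and the batch-mean prediction.

    Assumed form (not taken from the paper): -sum_k prior[k] * log(mean_pred[k]).
    logits: (batch, classes) array of unnormalized scores.
    prior:  (classes,) array of class proportions summing to 1.
    """
    probs = softmax(logits)          # per-sample class probabilities
    mean_pred = probs.mean(axis=0)   # predicted class proportions in the batch
    return float(-np.sum(prior * np.log(mean_pred + eps)))


# Toy long-tailed prior over three classes (hypothetical values).
prior = np.array([0.7, 0.2, 0.1])

# A batch that predicts the majority class almost exclusively.
biased_logits = np.tile([5.0, 0.0, 0.0], (8, 1))

# A batch whose average prediction equals the prior exactly.
matched_logits = np.tile(np.log(prior), (8, 1))

loss_biased = proportion_loss(biased_logits, prior)
loss_matched = proportion_loss(matched_logits, prior)
print(loss_biased > loss_matched)
```

In this sketch the loss is smallest when the batch-mean prediction equals the prior, so a batch collapsed onto the majority class is penalized. In an SSL method such as FixMatch, a term like this would presumably be added to the existing objective with a weighting coefficient, though that weighting is also an assumption here.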