CV

Feb 18, 2021

CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

Chen Wei, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang

arXiv:2102.09559v232.7327 citations

Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a realistic but understudied problem, semi-supervised learning on class-imbalanced data, and offers an incremental improvement over existing SSL methods.

The paper tackles semi-supervised learning on class-imbalanced data, where existing methods perform poorly on minority classes. It proposes the CReST and CReST+ frameworks, which improve state-of-the-art SSL algorithms on various class-imbalanced datasets and consistently outperform other rebalancing methods.

Semi-supervised learning on class-imbalanced data, although a realistic problem, has been understudied. While existing semi-supervised learning (SSL) methods are known to perform poorly on minority classes, we find that they still generate high-precision pseudo-labels on minority classes. By exploiting this property, we propose Class-Rebalancing Self-Training (CReST), a simple yet effective framework to improve existing SSL methods on class-imbalanced data. CReST iteratively retrains a baseline SSL model with a labeled set expanded by adding pseudo-labeled samples from an unlabeled set, where pseudo-labeled samples from minority classes are selected more frequently according to an estimated class distribution. We also propose CReST+, which adds a progressive distribution alignment to adaptively adjust the rebalancing strength. We show that CReST and CReST+ improve state-of-the-art SSL algorithms on various class-imbalanced datasets and consistently outperform other popular rebalancing methods. Code has been made available at https://github.com/google-research/crest.
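The class-rebalancing selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the exact sampling rule, so the sketch assumes that the k-th most frequent class is kept at the rate (N_(L+1-k) / N_1) ** alpha, where N_1 is the largest class count and L is the number of classes. The function names, the `alpha` exponent, and the example counts are all hypothetical.

```python
import numpy as np


def class_sampling_rates(class_counts, alpha=1.0):
    """Per-class inclusion rate for pseudo-labeled samples.

    Assumed rule (not given in the abstract): the k-th most frequent
    class is kept at rate (N_(L+1-k) / N_1) ** alpha, so the rarest
    class is always kept and the most frequent class is kept least often.
    """
    counts = np.asarray(class_counts, dtype=float)
    # Class indices ordered from most to least frequent.
    order = np.argsort(-counts, kind="stable")
    sorted_counts = counts[order]
    # Mirror the sorted counts: the most frequent class receives the
    # (normalized) count of the least frequent class, and vice versa.
    mirrored = sorted_counts[::-1] / sorted_counts[0]
    rates = np.empty_like(counts)
    rates[order] = mirrored ** alpha
    return rates


def select_pseudo_labels(pseudo_labels, rates, rng):
    """Return indices of unlabeled samples to add to the labeled set.

    Each sample is kept independently with the rate of its predicted class.
    """
    pseudo_labels = np.asarray(pseudo_labels)
    keep = rng.random(pseudo_labels.shape[0]) < rates[pseudo_labels]
    return np.flatnonzero(keep)


# Hypothetical labeled-set class counts: class 0 is the majority class.
rates = class_sampling_rates([100, 10, 1], alpha=1.0)
rng = np.random.default_rng(0)
chosen = select_pseudo_labels([0, 0, 1, 2, 2], rates, rng)
```

In a full self-training loop, the selected samples would be merged into the labeled set and the baseline SSL model retrained, with the class counts re-estimated at each generation. An `alpha` below 1 softens the rebalancing, and `alpha = 0` keeps every pseudo-label.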

