ReMix: Calibrated Resampling for Class Imbalance in Deep Learning
This work tackles class imbalance in deep learning models, a problem that matters for decision support systems in critical areas like health, medicine, transportation, and finance.
This paper addresses class imbalance in deep learning by proposing ReMix, a training technique that combines batch resampling, instance mixing, and soft labels. Models trained with ReMix generally achieve a higher g-mean and a better balanced Brier score than the alternatives, indicating better performance and calibration.
Class imbalance is a significant problem in applied deep learning, where trained models are used for decision support and automated decisions in critical areas such as health and medicine, transportation, and finance. Learning deep models from imbalanced training data remains challenging, and state-of-the-art solutions are typically data dependent and primarily focused on image data. Real-world imbalanced classification problems, however, are much more diverse, necessitating a general solution that can be applied to tabular, image, and text data. In this paper, we propose ReMix, a training technique that leverages batch resampling, instance mixing, and soft labels to enable the induction of robust deep models for imbalanced learning. Our results show that dense nets and CNNs trained with ReMix generally outperform the alternatives according to the g-mean and are better calibrated according to the balanced Brier score.
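The three ingredients named above (batch resampling, instance mixing, soft labels) can be illustrated with a minimal sketch. This is not the authors' exact procedure: it assumes class-balanced sampling with replacement, a mixup-style convex combination with a Beta-distributed coefficient, 2-D tabular features, and at least one training instance per class. The function name `remix_batch` and the parameter `alpha` are hypothetical.

```python
import numpy as np

def remix_batch(X, y, num_classes, batch_size, alpha=0.2, rng=None):
    """Illustrative sketch: balanced batch, then mixed instances and soft labels.

    Assumes X has shape (n_samples, n_features), y holds integer class ids,
    and every class in range(num_classes) has at least one instance.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Batch resampling: pick a class uniformly at random, then one of its instances.
    classes = rng.integers(0, num_classes, size=batch_size)
    idx = np.array([rng.choice(np.flatnonzero(y == c)) for c in classes])
    xb = X[idx].astype(float)
    yb = np.eye(num_classes)[y[idx]]  # one-hot labels for the sampled instances

    # Instance mixing: pair each instance with another one from the same batch
    # and take a convex combination (assumed mixup-style, lam ~ Beta(alpha, alpha)).
    lam = rng.beta(alpha, alpha, size=(batch_size, 1))
    perm = rng.permutation(batch_size)
    x_mix = lam * xb + (1.0 - lam) * xb[perm]

    # Soft labels: the same convex combination applied to the one-hot targets.
    y_mix = lam * yb + (1.0 - lam) * yb[perm]
    return x_mix, y_mix
```

The returned `y_mix` rows are probability vectors, so they would be paired with a loss that accepts soft targets, such as cross-entropy computed against the full label distribution.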