LG May 13

A Systematic Evaluation of Imbalance Handling Methods in Biomedical Binary Classification

Jiandong Chen, Lingjie Su, Le Peng, Yash Travadi, Rui Zhang, Ju Sun

arXiv:2605.14147 26.6

AI Analysis

Provides practical guidance for researchers and practitioners in biomedical binary classification on which IHMs to use based on model complexity and data modality.

This study systematically evaluated five imbalance handling methods (IHMs) across three biomedical datasets (tabular, text, image) and various model complexities. Results showed that ROS and RW consistently improved performance for complex models, while RUS and SMOTE degraded it; simpler models saw no significant benefit from IHMs.

Objective: The primary goal of this study was to systematically examine the impact of commonly used imbalance handling methods (IHMs) on predictive performance in biomedical binary classification, considering the interplay between model complexity and diverse data modalities.

Material and Methods: We evaluated five representative IHMs: random undersampling (RUS), random oversampling (ROS), SMOTE, re-weighting (RW), and direct F1-score optimization (DMO), against a raw training (RAW) baseline. The evaluation encompassed three public biomedical datasets: MIMIC-III (tabular), ADE-Corpus-V2 (text), and MURA (image), spanning three common biomedical data modalities. To assess varying model complexity, we employed a range of architectures, from classical logistic regression and random forest to deep neural networks, including multilayer perceptron (MLP), BiLSTM, BERT, DenseNet, and DINOv2.

Results: For simpler models such as logistic regression on tabular data, IHMs yielded no significant advantage over the RAW baseline, aligning with prior findings. However, clear benefits were observed for more complex models and unstructured data: (a) ROS and RW consistently enhanced the performance of powerful models; (b) direct F1-score optimization demonstrated utility primarily for unstructured text and image data; and (c) RUS and SMOTE consistently degraded performance and are therefore not recommended.

Conclusion: The effectiveness of IHMs depends on both model complexity and data modality. Performance gains are most pronounced when leveraging appropriate IHMs, such as ROS, RW, and DMO, on high-complexity models.
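For readers unfamiliar with the methods named in the abstract, the following is a minimal pure-Python sketch of ROS, RUS, and RW, plus the F1 score that DMO targets. It is not the authors' implementation. The function names, the toy data, and the inverse-frequency weighting formula `n / (k * count)` are assumptions made for illustration; the paper may define its weights differently. SMOTE and DMO itself are omitted because they need interpolation and a differentiable surrogate objective, respectively.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """ROS sketch: duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Sample with replacement from this class to fill the gap.
        extra = [rng.choice(idx) for _ in range(target - n)]
        X_out.extend(X[i] for i in extra)
        y_out.extend(y[i] for i in extra)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """RUS sketch: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, target))  # without replacement
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def balanced_class_weights(y):
    """RW sketch (assumed formula): loss weight inversely proportional to class frequency."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 8 negatives, 2 positives.
X = list(range(10))
y = [0] * 8 + [1] * 2

X_ros, y_ros = random_oversample(X, y)
X_rus, y_rus = random_undersample(X, y)
print(Counter(y_ros), Counter(y_rus), balanced_class_weights(y))
```

The sketch makes the trade-off in the reported results concrete: RUS discards most of the majority-class rows, while ROS and RW keep every original example and only change how often, or how heavily, minority examples count during training.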
