LG AI ML
Aug 31, 2025

ART: Adaptive Resampling-based Training for Imbalanced Classification

Arjun Basandrai, Shourya Jain, K. Ilanthenral

arXiv:2509.00955v1
h-index: 8

Originality: Incremental advance

AI Analysis

This paper addresses imbalanced classification for machine learning practitioners, offering a more reliable method than static resampling approaches, though the advance is incremental.

The paper tackles class imbalance in classification by proposing an adaptive resampling method that updates the training data distribution based on class-wise performance, resulting in an average macro F1 improvement of 2.64 percentage points over training on the original imbalanced data across the tested tabular datasets.

Traditional resampling methods for handling class imbalance typically use fixed distributions, undersampling the majority or oversampling the minority. These static strategies ignore changes in class-wise learning difficulty, which can limit the overall performance of the model. This paper proposes an Adaptive Resampling-based Training (ART) method that periodically updates the distribution of the training data based on the class-wise performance of the model. Specifically, ART uses class-wise macro F1 scores, computed at fixed intervals, to determine the degree of resampling to be performed. Unlike instance-level difficulty modeling, which is noisy and outlier-sensitive, ART adapts at the class level. This allows the model to incrementally shift its attention towards underperforming classes in a way that better aligns with the optimization objective. Results on diverse benchmarks, including the Pima Indians Diabetes and Yeast datasets, demonstrate that ART consistently outperforms both resampling-based and algorithm-level methods, including Synthetic Minority Oversampling Technique (SMOTE), NearMiss Undersampling, and Cost-sensitive Learning, on binary as well as multi-class classification tasks with varying degrees of imbalance. In most settings, these improvements are statistically significant. On tabular datasets, gains are significant under paired t-tests and Wilcoxon tests (p < 0.05), while results on text and image tasks remain favorable. Compared to training on the original imbalanced data, ART improves macro F1 by an average of 2.64 percentage points across all tested tabular datasets. Unlike existing methods, whose performance varies by task, ART consistently delivers the strongest macro F1, making it a reliable choice for imbalanced classification.
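The abstract describes the idea of class-level adaptive resampling but does not give the update rule, so the following is only a minimal sketch of what one such step might look like, not the authors' implementation. The function names, the weighting rule (sampling probability proportional to 1 - F1 plus a small floor `eps`), and the toy labels are all assumptions made for illustration.

```python
import random


def per_class_f1(y_true, y_pred, classes):
    """One-vs-rest F1 for each class: 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def class_sampling_probs(f1_by_class, eps=0.05):
    """Assumed rule (not from the paper): weight each class by (1 - F1) + eps,
    so classes the model handles poorly are sampled more often."""
    raw = {c: (1.0 - f1) + eps for c, f1 in f1_by_class.items()}
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()}


def resample_indices(y, probs, n_samples, rng):
    """Draw indices with replacement: choose a class according to probs,
    then choose a uniformly random example of that class."""
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    classes = [c for c in probs if c in by_class]
    weights = [probs[c] for c in classes]
    picked = rng.choices(classes, weights=weights, k=n_samples)
    return [rng.choice(by_class[c]) for c in picked]


# Toy example: 8 majority (0) and 2 minority (1) labels; the hypothetical
# model misclassifies one of the two minority examples.
rng = random.Random(0)
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]

f1 = per_class_f1(y_true, y_pred, classes=[0, 1])
probs = class_sampling_probs(f1)
new_idx = resample_indices(y_true, probs, n_samples=len(y_true), rng=rng)
```

In a training loop, a step like this would presumably run at a fixed interval (for example every few epochs), with `new_idx` selecting the examples used until the next update. Because the weights depend on per-class F1 rather than per-example loss, a single outlier cannot dominate the sampling distribution, which is the motivation the abstract gives for adapting at the class level.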
