ML, LG, Jan 15

Classification Imbalance as Transfer Learning

arXiv:2601.10630v1, 1 citation, h-index: 16
Originality: Incremental advance
AI Analysis

The paper provides theoretical guidance for choosing augmentation strategies in imbalanced classification, but the advance is incremental because it builds on existing methods such as SMOTE.

The paper frames classification imbalance as transfer learning under label shift. It shows that the excess risk includes a cost-of-transfer term, which quantifies the discrepancy between the estimated and true minority-class distributions, and that this cost is larger for SMOTE than for bootstrapping in moderately high dimensions.

Classification imbalance arises when one class is much rarer than the other. We frame this setting as transfer learning under label (prior) shift between an imbalanced source distribution induced by the observed data and a balanced target distribution under which performance is evaluated. Within this framework, we study a family of oversampling procedures that augment the training data by generating synthetic samples from an estimated minority-class distribution to roughly balance the classes, among which the celebrated SMOTE algorithm is a canonical example. We show that the excess risk decomposes into the rate achievable under balanced training (as if the data had been drawn from the balanced target distribution) and an additional term, the cost of transfer, which quantifies the discrepancy between the estimated and true minority-class distributions. In particular, we show that the cost of transfer for SMOTE dominates that of bootstrapping (random oversampling) in moderately high dimensions, suggesting that we should expect bootstrapping to have better performance than SMOTE in general. We corroborate these findings with experimental evidence. More broadly, our results provide guidance for choosing among augmentation strategies for imbalanced classification.
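The two oversampling procedures compared in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names (`bootstrap_oversample`, `smote_like_oversample`, `balance`), the toy Gaussian data, and the choice `k = 5` are all assumptions made for illustration, and the second sampler is a simplified SMOTE-style interpolation rather than a full SMOTE implementation. Both samplers draw synthetic minority points from an estimate of the minority-class distribution until the classes are balanced.

```python
import numpy as np

def bootstrap_oversample(X_min, n_new, rng):
    """Random oversampling: resample minority rows with replacement."""
    idx = rng.integers(0, len(X_min), size=n_new)
    return X_min[idx]

def smote_like_oversample(X_min, n_new, k, rng):
    """Simplified SMOTE-style sampler (illustrative, requires k < len(X_min)).

    Each synthetic point is a random convex combination of a minority point
    and one of its k nearest minority neighbours.
    """
    # Pairwise squared Euclidean distances among minority points.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]

    base = rng.integers(0, len(X_min), size=n_new)   # anchor points
    pick = rng.integers(0, k, size=n_new)            # which neighbour to use
    nb = neighbours[base, pick]
    lam = rng.random((n_new, 1))                     # interpolation weights in [0, 1)
    return X_min[base] + lam * (X_min[nb] - X_min[base])

def balance(X, y, sampler, rng):
    """Append synthetic minority samples (label 1) until both classes match in size."""
    X_min, X_maj = X[y == 1], X[y == 0]
    n_new = len(X_maj) - len(X_min)
    X_syn = sampler(X_min, n_new, rng)
    X_bal = np.vstack([X, X_syn])
    y_bal = np.concatenate([y, np.ones(n_new, dtype=y.dtype)])
    return X_bal, y_bal

# Hypothetical toy data: 200 majority and 20 minority points in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 5)), rng.normal(1.0, 1.0, (20, 5))])
y = np.concatenate([np.zeros(200, dtype=int), np.ones(20, dtype=int)])

X_boot, y_boot = balance(X, y, bootstrap_oversample, rng)
X_smote, y_smote = balance(
    X, y, lambda Xm, n, r: smote_like_oversample(Xm, n, 5, r), rng
)
```

Either balanced set can then be passed to any classifier. The sketch only shows how the two estimated minority-class distributions differ: bootstrapping resamples observed points, while the SMOTE-style sampler places new points on segments between neighbouring observations.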

