LG | AI | Aug 29, 2025

Beyond Synthetic Augmentation: Group-Aware Threshold Calibration for Robust Balanced Accuracy in Imbalanced Learning

arXiv:2509.02592v1

Originality: Highly original

AI Analysis

The paper addresses class imbalance for practitioners who need robust, fairness-aware models, and it offers a simpler alternative to synthetic augmentation methods.

The paper tackles class imbalance in machine learning by proposing group-aware threshold calibration, which sets different decision thresholds for different demographic groups. The method achieves 1.5-4% higher balanced accuracy than synthetic data generation methods like SMOTE and CT-GAN while improving worst-group balanced accuracy.

Class imbalance remains a fundamental challenge in machine learning, with traditional solutions often creating as many problems as they solve. We demonstrate that group-aware threshold calibration (setting different decision thresholds for different demographic groups) provides superior robustness compared to synthetic data generation methods. Through extensive experiments, we show that group-specific thresholds achieve 1.5-4% higher balanced accuracy than SMOTE- and CT-GAN-augmented models while improving worst-group balanced accuracy. Unlike single-threshold approaches that apply one cutoff across all groups, our group-aware method optimizes the Pareto frontier between balanced accuracy and worst-group balanced accuracy, enabling fine-grained control over group-level performance. Critically, we find that applying group thresholds to synthetically augmented data yields minimal additional benefit, suggesting these approaches are fundamentally redundant. Our results span seven model families including linear, tree-based, instance-based, and boosting methods, confirming that group-aware threshold calibration offers a simpler, more interpretable, and more effective solution to class imbalance.
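To make the idea of per-group cutoffs concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the calibration procedure, so the grid search, the function names (`fit_group_thresholds`, `predict_with_group_thresholds`, `worst_group_balanced_accuracy`), and the choice to maximize each group's own balanced accuracy on held-out scores are all assumptions made for illustration.

```python
import numpy as np


def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate for binary labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == 1
    neg = y_true == 0
    # If a class is absent, its rate is set to 0.0 (an assumption of this sketch).
    tpr = float(np.mean(y_pred[pos] == 1)) if pos.any() else 0.0
    tnr = float(np.mean(y_pred[neg] == 0)) if neg.any() else 0.0
    return 0.5 * (tpr + tnr)


def fit_group_thresholds(scores, y_true, groups, grid=None):
    """Pick one cutoff per group by grid search (hypothetical procedure).

    For each group, the cutoff that maximizes that group's balanced
    accuracy on the supplied (ideally held-out) scores is kept.
    """
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    scores = np.asarray(scores)
    y_true = np.asarray(y_true)
    groups = np.asarray(groups)
    thresholds = {}
    for g in np.unique(groups):
        mask = groups == g
        bal = [
            balanced_accuracy(y_true[mask], (scores[mask] >= t).astype(int))
            for t in grid
        ]
        thresholds[g] = float(grid[int(np.argmax(bal))])
    return thresholds


def predict_with_group_thresholds(scores, groups, thresholds):
    """Apply each example's group-specific cutoff to its score."""
    scores = np.asarray(scores)
    cutoffs = np.array([thresholds[g] for g in np.asarray(groups)])
    return (scores >= cutoffs).astype(int)


def worst_group_balanced_accuracy(y_true, y_pred, groups):
    """Minimum balanced accuracy over groups."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)
    return min(
        balanced_accuracy(y_true[groups == g], y_pred[groups == g])
        for g in np.unique(groups)
    )
```

In use, the scores would come from any probabilistic classifier (for example, the positive-class column of a `predict_proba` output), with thresholds fitted on a validation split and applied to test data. Comparing `worst_group_balanced_accuracy` for these predictions against a single global cutoff gives a rough picture of the trade-off the abstract describes.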
