LG | AI | ML | Sep 29, 2024

Balancing the Scales: A Comprehensive Study on Tackling Class Imbalance in Binary Classification

arXiv:2409.19751v1 | 19 citations | h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses poor minority-class performance, a common problem for machine learning practitioners, but it is incremental: it compares existing methods rather than introducing new ones.

The study evaluates three strategies for class imbalance in binary classification (SMOTE, Class Weights, and Decision Threshold Calibration) across 15 models and 30 datasets. All three outperformed a no-intervention baseline; Decision Threshold Calibration was the most consistently effective, though the best-performing method varied by dataset.

Class imbalance in binary classification tasks remains a significant challenge in machine learning, often resulting in poor performance on minority classes. This study comprehensively evaluates three widely used strategies for handling class imbalance: Synthetic Minority Over-sampling Technique (SMOTE), Class Weights tuning, and Decision Threshold Calibration. We compare these methods against a no-intervention baseline across 15 diverse machine learning models and 30 datasets from various domains, conducting a total of 9,000 experiments. Performance was primarily assessed using the F1-score, although our study also tracked results on nine additional metrics, including F2-score, precision, recall, Brier score, PR-AUC, and AUC. Our results indicate that all three strategies generally outperform the baseline, with Decision Threshold Calibration emerging as the most consistently effective technique. However, we observed substantial variability in the best-performing method across datasets, highlighting the importance of testing multiple approaches for specific problems. This study provides valuable insights for practitioners dealing with imbalanced datasets and emphasizes the need for dataset-specific analysis in evaluating class imbalance handling techniques.
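To make the comparison concrete, here is a minimal sketch of two of the three strategies set against a no-intervention baseline. It is not the study's code. The synthetic dataset, the logistic regression model, the threshold grid, and the helper name `best_f1_threshold` are all assumptions chosen for illustration. `class_weight="balanced"` stands in for the tuned class weights the abstract mentions, and SMOTE is left out because it needs the separate imbalanced-learn package.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (roughly 10% positives); purely illustrative.
X, y = make_classification(
    n_samples=4000, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Train / validation / test split; the validation set is used only to pick a threshold.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0
)


def best_f1_threshold(y_true, proba):
    """Hypothetical helper: grid-search the cutoff that maximises F1."""
    grid = np.linspace(0.05, 0.95, 19)
    scores = [
        f1_score(y_true, (proba >= t).astype(int), zero_division=0) for t in grid
    ]
    return float(grid[int(np.argmax(scores))])


# 1. Baseline: no intervention, default 0.5 decision threshold.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
f1_baseline = f1_score(y_test, baseline.predict(X_test))

# 2. Class weights: reweight the loss inversely to class frequency
#    (a fixed heuristic here, not the tuned weights used in the study).
weighted = LogisticRegression(max_iter=1000, class_weight="balanced")
weighted.fit(X_train, y_train)
f1_weighted = f1_score(y_test, weighted.predict(X_test))

# 3. Decision threshold calibration: keep the baseline model unchanged,
#    but choose the probability cutoff on the validation set.
threshold = best_f1_threshold(y_val, baseline.predict_proba(X_val)[:, 1])
test_proba = baseline.predict_proba(X_test)[:, 1]
f1_threshold = f1_score(y_test, (test_proba >= threshold).astype(int))

print(f"baseline F1:          {f1_baseline:.3f}")
print(f"class-weighted F1:    {f1_weighted:.3f}")
print(f"tuned threshold {threshold:.2f}: F1 = {f1_threshold:.3f}")
```

Threshold calibration leaves the fitted model untouched and changes only the cutoff applied to its predicted probabilities, so it is cheap to try alongside the other strategies. As the abstract cautions, which approach wins can differ from one dataset to the next, so the numbers from a single run like this should not be generalized.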

