Enhancing Clinical Predictive Modeling through Model Complexity-Driven Class Proportion Tuning for Class Imbalanced Data: An Empirical Study on Opioid Overdose Prediction
This work addresses the performance degradation that class imbalance causes in clinical predictive models, offering a model-specific tuning approach that builds incrementally on existing rebalancing techniques.
The paper tackles the class imbalance problem in clinical predictive modeling by proposing that optimal class proportions should be linked to model complexity, rather than being fixed based on the original data. Experiments on opioid overdose prediction show performance gains, with regression analysis confirming a statistically significant correlation between model complexity hyperparameters and optimal class proportions.
Class imbalance is widespread in the medical field and severely degrades the performance of clinical predictive models. Most techniques that alleviate the problem rebalance class proportions, and they predominantly assume that the rebalanced proportions should be a function of the original data, oblivious to the model one uses. This work challenges this prevailing assumption and proposes an approach that links the optimal class proportions to model complexity, thereby tuning the class proportions per model. Our experiments on the opioid overdose prediction problem highlight the performance gain from tuning class proportions. Rigorous regression analysis also confirms the advantages of the proposed theoretical framework and the statistically significant correlation between the hyperparameters controlling model complexity and the optimal class proportions.
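To make the idea of tuning class proportions per model concrete, the following is a minimal sketch and not the paper's implementation. It assumes binary labels with 1 as the minority class, uses random undersampling of the majority class as the rebalancing step (the paper may use a different technique), and relies on a hypothetical caller-supplied `fit_and_score(train_indices, complexity)` that trains a model at the given complexity setting and returns a validation score. The names `resample_to_proportion` and `tune_proportion_per_complexity` are illustrative only.

```python
import random


def resample_to_proportion(labels, pos_fraction, seed=0):
    """Undersample negatives so positives make up about pos_fraction.

    Assumption: labels are 0/1 and class 1 is the minority class.
    Returns a shuffled list of indices into labels.
    """
    if not 0 < pos_fraction <= 1:
        raise ValueError("pos_fraction must be in (0, 1]")
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Solve len(pos) / (len(pos) + n_neg) = pos_fraction for n_neg,
    # capped at the number of negatives actually available.
    n_neg = min(len(neg), round(len(pos) * (1 - pos_fraction) / pos_fraction))
    keep = pos + rng.sample(neg, n_neg)
    rng.shuffle(keep)
    return keep


def tune_proportion_per_complexity(labels, complexities, pos_fractions,
                                   fit_and_score):
    """For each complexity value, pick the best-scoring positive fraction.

    fit_and_score is a hypothetical callable supplied by the caller:
    (train_indices, complexity) -> validation score, higher is better.
    """
    best = {}
    for c in complexities:
        scored = [
            (fit_and_score(resample_to_proportion(labels, p), c), p)
            for p in pos_fractions
        ]
        best[c] = max(scored)[1]
    return best
```

The resulting mapping from complexity setting to best class proportion is the kind of data on which a regression of optimal proportion against complexity hyperparameters could then be run.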