LG · ME · ML · Jul 26, 2024

Boosted generalized normal distributions: Integrating machine learning with operations knowledge

arXiv:2407.19092v2 · 2 citations · h-index: 24
Originality: Incremental advance
AI Analysis

This work addresses the need for better distributional forecasts in operational settings such as healthcare, though its contribution is incremental: it integrates existing operations knowledge with ML techniques.

The paper addresses two limitations of machine learning methods in operational settings: they mostly provide point predictions rather than distributional information, and they do not incorporate operations knowledge. It introduces the Boosted Generalized Normal Distribution (bGND) methodology, which outperforms a distribution-agnostic ML benchmark by 6% and 9% when forecasting the distributions of patient wait and service times, respectively. Further analysis in the paper suggests that these improvements translate into a 9% increase in patient satisfaction and a 4% reduction in mortality for myocardial infarction patients.

Applications of machine learning (ML) techniques to operational settings often face two challenges: i) ML methods mostly provide point predictions whereas many operational problems require distributional information; and ii) they typically do not incorporate the extensive body of knowledge in the operations literature, particularly the theoretical and empirical findings that characterize specific distributions. We introduce a novel and rigorous methodology, the Boosted Generalized Normal Distribution ($b$GND), to address these challenges. The Generalized Normal Distribution (GND) encompasses a wide range of parametric distributions commonly encountered in operations, and $b$GND leverages gradient boosting with tree learners to flexibly estimate the parameters of the GND as functions of covariates. We establish $b$GND's statistical consistency, thereby extending this key property to special cases studied in the ML literature that lacked such guarantees. Using data from a large academic emergency department in the United States, we show that the distributional forecasting of patient wait and service times can be meaningfully improved by leveraging findings from the healthcare operations literature. Specifically, $b$GND performs 6% and 9% better than the distribution-agnostic ML benchmark used to forecast wait and service times, respectively. Further analysis suggests that these improvements translate into a 9% increase in patient satisfaction and a 4% reduction in mortality for myocardial infarction patients. Our work underscores the importance of integrating ML with operations knowledge to enhance distributional forecasts.
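To make the idea of boosting a distribution parameter concrete, here is a minimal sketch and not the authors' implementation. It assumes the common exponential-power form of the GND, with density beta / (2 * alpha * Gamma(1/beta)) * exp(-(|y - mu| / alpha)^beta); the abstract does not state the parameterization. It boosts only the location mu with one-feature regression stumps while holding the scale alpha and shape beta fixed, whereas the abstract describes estimating the GND parameters jointly with tree learners. All function names, the toy data, and the learning rate are hypothetical choices for illustration.

```python
import math
import random


def gnd_nll(y, mu, alpha, beta):
    """Negative log-density of y under the assumed GND(mu, alpha, beta)."""
    z = abs(y - mu) / alpha
    return z ** beta - math.log(beta) + math.log(2 * alpha) + math.lgamma(1 / beta)


def nll_grad_mu(y, mu, alpha, beta):
    """Derivative of gnd_nll with respect to the location mu."""
    d = y - mu
    if d == 0:
        return 0.0  # convention at the minimum, where the loss may have a kink
    return -(beta / alpha) * (abs(d) / alpha) ** (beta - 1) * math.copysign(1.0, d)


def fit_stump(xs, targets):
    """Least-squares regression stump on one feature: (threshold, left, right)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # dropping the largest x keeps the right leaf non-empty
        left = [g for x, g in zip(xs, targets) if x <= t]
        right = [g for x, g in zip(xs, targets) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((g - lm) ** 2 for g in left) + sum((g - rm) ** 2 for g in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def stump_value(stump, x):
    t, left, right = stump
    return left if x <= t else right


def boost_location(xs, ys, alpha, beta, rounds=40, lr=0.1):
    """Gradient boosting on the GND negative log-likelihood, location only."""
    init = sum(ys) / len(ys)  # constant starting fit
    mu = [init] * len(ys)
    stumps = []
    for _ in range(rounds):
        # Fit a stump to the negative gradient of the loss at the current fit.
        neg_grad = [-nll_grad_mu(y, m, alpha, beta) for y, m in zip(ys, mu)]
        t, lv, rv = fit_stump(xs, neg_grad)
        stump = (t, lr * lv, lr * rv)  # store leaf values already shrunk by lr
        stumps.append(stump)
        mu = [m + stump_value(stump, x) for x, m in zip(xs, mu)]
    return init, stumps


def predict_location(init, stumps, x):
    return init + sum(stump_value(s, x) for s in stumps)


# Toy data (an assumption for illustration): the location jumps at x = 0.5.
random.seed(0)
xs = [random.random() for _ in range(100)]
ys = [(3.0 if x > 0.5 else 0.0) + random.gauss(0.0, 0.5) for x in xs]
init, stumps = boost_location(xs, ys, alpha=1.0, beta=1.5)


def mean_nll(predict):
    return sum(gnd_nll(y, predict(x), 1.0, 1.5) for x, y in zip(xs, ys)) / len(ys)


nll_before = mean_nll(lambda x: init)
nll_after = mean_nll(lambda x: predict_location(init, stumps, x))
print(f"mean NLL: constant fit {nll_before:.3f}, boosted fit {nll_after:.3f}")
```

Under this assumed parameterization, beta = 2 recovers the normal distribution with standard deviation alpha / sqrt(2) and beta = 1 recovers the Laplace distribution. This is one way a single family can encode distributional knowledge: the shape is fixed or constrained from prior findings rather than left entirely to the learner.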

