LGSep 28, 2025Code
HyMaTE: A Hybrid Mamba and Transformer Model for EHR Representation LearningMd Mozaharul Mottalib, Thao-Ly T. Phan, Rahmatollah Beheshti
Electronic health Records (EHRs) have become a cornerstone in modern-day healthcare. They are a crucial part for analyzing the progression of patient health; however, their complexity, characterized by long, multivariate sequences, sparsity, and missing values poses significant challenges in traditional deep learning modeling. While Transformer-based models have demonstrated success in modeling EHR data and predicting clinical outcomes, their quadratic computational complexity and limited context length hinder their efficiency and practical applications. On the other hand, State Space Models (SSMs) like Mamba present a promising alternative offering linear-time sequence modeling and improved efficiency for handling long sequences, but focus mostly on mixing sequence-level information rather than channel-level data. To overcome these challenges, we propose HyMaTE (A Hybrid Mamba and Transformer Model for EHR Representation Learning), a novel hybrid model tailored for representing longitudinal data, combining the strengths of SSMs with advanced attention mechanisms. By testing the model on predictive tasks on multiple clinical datasets, we demonstrate HyMaTE's ability to capture an effective, richer, and more nuanced unified representation of EHR data. Additionally, the interpretability of the outcomes achieved by self-attention illustrates the effectiveness of our model as a scalable and generalizable solution for real-world healthcare applications. Codes are available at: https://github.com/healthylaife/HyMaTE.
LGDec 12, 2024
An Interoperable Machine Learning Pipeline for Pediatric Obesity Risk EstimationHamed Fayyaz, Mehak Gupta, Alejandra Perez Ramirez et al.
Reliable prediction of pediatric obesity can offer a valuable resource to providers, helping them engage in timely preventive interventions before the disease is established. Many efforts have been made to develop ML-based predictive models of obesity, and some studies have reported high predictive performances. However, no commonly used clinical decision support tool based on existing ML models currently exists. This study presents a novel end-to-end pipeline specifically designed for pediatric obesity prediction, which supports the entire process of data extraction, inference, and communication via an API or a user interface. While focusing only on routinely recorded data in pediatric electronic health records (EHRs), our pipeline uses a diverse expert-curated list of medical concepts to predict the 1-3 years risk of developing obesity. Furthermore, by using the Fast Healthcare Interoperability Resources (FHIR) standard in our design procedure, we specifically target facilitating low-effort integration of our pipeline with different EHR systems. In our experiments, we report the effectiveness of the predictive model as well as its alignment with the feedback from various stakeholders, including ML scientists, providers, health IT personnel, health administration representatives, and patient group representatives.
LGFeb 3, 2022
Who will Leave a Pediatric Weight Management Program and When? -- A machine learning approach for predicting attrition patternsHamed Fayyaz, Thao-Ly T. Phan, H. Timothy Bunnell et al.
Childhood obesity is a major public health concern. Multidisciplinary pediatric weight management programs are considered standard treatment for children with obesity and severe obesity who are not able to be successfully managed in the primary care setting; however, high drop-out rates (referred to as attrition) are a major hurdle in delivering successful interventions. Predicting attrition patterns can help providers reduce the attrition rates. Previous work has mainly focused on finding static predictors of attrition using statistical analysis methods. In this study, we present a machine learning model to predict (a) the likelihood of attrition, and (b) the change in body-mass index (BMI) percentile of children, at different time points after joining a weight management program. We use a five-year dataset containing the information related to around 4,550 children that we have compiled using data from the Nemours Pediatric Weight Management program. Our models show strong prediction performance as determined by high AUROC scores across different tasks (average AUROC of 0.75 for predicting attrition, and 0.73 for predicting weight outcomes). Additionally, we report the top features predicting attrition and weight outcomes in a series of explanatory experiments.
APDec 5, 2019
Obesity Prediction with EHR Data: A deep learning approach with interpretable elementsMehak Gupta, Thao-Ly T. Phan, Timothy Bunnell et al.
Childhood obesity is a major public health challenge. Early prediction and identification of the children at a high risk of developing childhood obesity may help in engaging earlier and more effective interventions to prevent and manage obesity. Most existing predictive tools for childhood obesity primarily rely on traditional regression-type methods using only a few hand-picked features and without exploiting longitudinal patterns of children data. Deep learning methods allow the use of high-dimensional longitudinal datasets. In this paper, we present a deep learning model designed for predicting future obesity patterns from generally available items on children medical history. To do this, we use a large unaugmented electronic health records dataset from a large pediatric health system. We adopt a general LSTM network architecture which are known to better represent the longitudinal data. We train our proposed model on both dynamic and static EHR data. Our model is used to predict obesity for ages between 2-20 years. We compared the performance of our LSTM model with other machine learning methods that aggregate over sequential data and ignore temporality. To add interpretability, we have additionally included an attention layer to calculate the attention scores for the timestamps and rank features of each timestamp.