Josephine Lamp

LG
h-index67
7papers
16citations
Novelty48%
AI Score42

7 Papers

LGMar 2, 2023
GlucoSynth: Generating Differentially-Private Synthetic Glucose Traces

Josephine Lamp, Mark Derdzinski, Christopher Hannemann et al.

We focus on the problem of generating high-quality, private synthetic glucose traces, a task generalizable to many other time series sources. Existing methods for time series data synthesis, such as those using Generative Adversarial Networks (GANs), are not able to capture the innate characteristics of glucose data and cannot provide any formal privacy guarantees without severely degrading the utility of the synthetic data. In this paper we present GlucoSynth, a novel privacy-preserving GAN framework to generate synthetic glucose traces. The core intuition behind our approach is to conserve relationships amongst motifs (glucose events) within the traces, in addition to temporal dynamics. Our framework incorporates differential privacy mechanisms to provide strong formal privacy guarantees. We provide a comprehensive evaluation on the real-world utility of the data using 1.2 million glucose traces; GlucoSynth outperforms all previous methods in its ability to generate high-quality synthetic glucose traces with strong privacy guarantees.

LGSep 23, 2024
MotifDisco: Motif Causal Discovery For Time Series Motifs

Josephine Lamp, Mark Derdzinski, Christopher Hannemann et al.

Many time series, particularly health data streams, can be best understood as a sequence of phenomenon or events, which we call \textit{motifs}. A time series motif is a short trace segment which may implicitly capture an underlying phenomenon within the time series. Specifically, we focus on glucose traces collected from continuous glucose monitors (CGMs), which inherently contain motifs representing underlying human behaviors such as eating and exercise. The ability to identify and quantify \textit{causal} relationships amongst motifs can provide a mechanism to better understand and represent these patterns, useful for improving deep learning and generative models and for advanced technology development (e.g., personalized coaching and artificial insulin delivery systems). However, no previous work has developed causal discovery methods for time series motifs. Therefore, in this paper we develop MotifDisco (\textbf{motif} \textbf{disco}very of causality), a novel causal discovery framework to learn causal relations amongst motifs from time series traces. We formalize a notion of \textit{Motif Causality (MC)}, inspired from Granger Causality and Transfer Entropy, and develop a Graph Neural Network-based framework that learns causality between motifs by solving an unsupervised link prediction problem. We integrate MC with three model use cases of forecasting, anomaly detection and clustering, to showcase the use of MC as a building block for downstream tasks. Finally, we evaluate our framework on different health data streams and find that Motif Causality provides a significant performance improvement in all use cases.

LGJun 11, 2023
CARNA: Characterizing Advanced heart failure Risk and hemodyNAmic phenotypes using learned multi-valued decision diagrams

Josephine Lamp, Yuxin Wu, Steven Lamp et al.

Early identification of high risk heart failure (HF) patients is key to timely allocation of life-saving therapies. Hemodynamic assessments can facilitate risk stratification and enhance understanding of HF trajectories. However, risk assessment for HF is a complex, multi-faceted decision-making process that can be challenging. Previous risk models for HF do not integrate invasive hemodynamics or support missing data, and use statistical methods prone to bias or machine learning methods that are not interpretable. To address these limitations, this paper presents CARNA, a hemodynamic risk stratification and phenotyping framework for advanced HF that takes advantage of the explainability and expressivity of machine learned Multi-Valued Decision Diagrams (MVDDs). This interpretable framework learns risk scores that predict the probability of patient outcomes, and outputs descriptive patient phenotypes (sets of features and thresholds) that characterize each predicted risk score. CARNA incorporates invasive hemodynamics and can make predictions on missing data. The CARNA models were trained and validated using a total of five advanced HF patient cohorts collected from previous trials, and compared with six established HF risk scores and three traditional ML risk models. CARNA provides robust risk stratification, outperforming all previous benchmarks. Although focused on advanced HF, the CARNA framework is general purpose and can be used to learn risk stratifications for other diseases and medical applications.

LGNov 23, 2022
Towards Developing Safety Assurance Cases for Learning-Enabled Medical Cyber-Physical Systems

Maryam Bagheri, Josephine Lamp, Xugui Zhou et al.

Machine Learning (ML) technologies have been increasingly adopted in Medical Cyber-Physical Systems (MCPS) to enable smart healthcare. Assuring the safety and effectiveness of learning-enabled MCPS is challenging, as such systems must account for diverse patient profiles and physiological dynamics and handle operational uncertainties. In this paper, we develop a safety assurance case for ML controllers in learning-enabled MCPS, with an emphasis on establishing confidence in the ML-based predictions. We present the safety assurance case in detail for Artificial Pancreas Systems (APS) as a representative application of learning-enabled MCPS, and provide a detailed analysis by implementing a deep neural network for the prediction in APS. We check the sufficiency of the ML data and analyze the correctness of the ML-based prediction using formal verification. Finally, we outline open research problems based on our experience in this paper.

LGJan 28Code
Safety Generalization Under Distribution Shift in Safe Reinforcement Learning: A Diabetes Testbed

Minjae Kwon, Josephine Lamp, Lu Feng

Safe Reinforcement Learning (RL) algorithms are typically evaluated under fixed training conditions. We investigate whether training-time safety guarantees transfer to deployment under distribution shift, using diabetes management as a safety-critical testbed. We benchmark safe RL algorithms on a unified clinical simulator and reveal a safety generalization gap: policies satisfying constraints during training frequently violate safety requirements on unseen patients. We demonstrate that test-time shielding, which filters unsafe actions using learned dynamics models, effectively restores safety across algorithms and patient populations. Across eight safe RL algorithms, three diabetes types, and three age groups, shielding achieves Time-in-Range gains of 13--14\% for strong baselines such as PPO-Lag and CPO while reducing clinical risk index and glucose variability. Our simulator and benchmark provide a platform for studying safety under distribution shift in safety-critical control domains. Code is available at https://github.com/safe-autonomy-lab/GlucoSim and https://github.com/safe-autonomy-lab/GlucoAlg.

LGSep 26, 2025
DM-Bench: Benchmarking LLMs for Personalized Decision Making in Diabetes Management

Maria Ana Cardei, Josephine Lamp, Mark Derdzinski et al.

We present DM-Bench, the first benchmark designed to evaluate large language model (LLM) performance across real-world decision-making tasks faced by individuals managing diabetes in their daily lives. Unlike prior health benchmarks that are either generic, clinician-facing or focused on clinical tasks (e.g., diagnosis, triage), DM-Bench introduces a comprehensive evaluation framework tailored to the unique challenges of prototyping patient-facing AI solutions in diabetes, glucose management, metabolic health and related domains. Our benchmark encompasses 7 distinct task categories, reflecting the breadth of real-world questions individuals with diabetes ask, including basic glucose interpretation, educational queries, behavioral associations, advanced decision making and long term planning. Towards this end, we compile a rich dataset comprising one month of time-series data encompassing glucose traces and metrics from continuous glucose monitors (CGMs) and behavioral logs (e.g., eating and activity patterns) from 15,000 individuals across three different diabetes populations (type 1, type 2, pre-diabetes/general health and wellness). Using this data, we generate a total of 360,600 personalized, contextual questions across the 7 tasks. We evaluate model performance on these tasks across 5 metrics: accuracy, groundedness, safety, clarity and actionability. Our analysis of 8 recent LLMs reveals substantial variability across tasks and metrics; no single model consistently outperforms others across all dimensions. By establishing this benchmark, we aim to advance the reliability, safety, effectiveness and practical utility of AI solutions in diabetes care.

AIDec 17, 2024
Quantitative Predictive Monitoring and Control for Safe Human-Machine Interaction

Shuyang Dong, Meiyi Ma, Josephine Lamp et al.

There is a growing trend toward AI systems interacting with humans to revolutionize a range of application domains such as healthcare and transportation. However, unsafe human-machine interaction can lead to catastrophic failures. We propose a novel approach that predicts future states by accounting for the uncertainty of human interaction, monitors whether predictions satisfy or violate safety requirements, and adapts control actions based on the predictive monitoring results. Specifically, we develop a new quantitative predictive monitor based on Signal Temporal Logic with Uncertainty (STL-U) to compute a robustness degree interval, which indicates the extent to which a sequence of uncertain predictions satisfies or violates an STL-U requirement. We also develop a new loss function to guide the uncertainty calibration of Bayesian deep learning and a new adaptive control method, both of which leverage STL-U quantitative predictive monitoring results. We apply the proposed approach to two case studies: Type 1 Diabetes management and semi-autonomous driving. Experiments show that the proposed approach improves safety and effectiveness in both case studies.