LG · Feb 8, 2023 · Code
Exploratory Analysis of Federated Learning Methods with Differential Privacy on MIMIC-III
Aron N. Horvath, Matteo Berchier, Farhad Nooralahzadeh et al.
Background: Federated learning methods offer the possibility of training machine learning models on privacy-sensitive data sets, which cannot be easily shared. Multiple regulations pose strict requirements on the storage and usage of healthcare data, leading to data being in silos (i.e. locked-in at healthcare facilities). The application of federated algorithms on these datasets could accelerate disease diagnostics and drug development, as well as improve patient care. Methods: We present an extensive evaluation of the impact of different federation and differential privacy techniques when training models on the open-source MIMIC-III dataset. We analyze a set of parameters influencing a federated model's performance, namely data distribution (homogeneous and heterogeneous), communication strategies (communication rounds vs. local training epochs), and federation strategies (FedAvg vs. FedProx). Furthermore, we assess and compare two differential privacy (DP) techniques during model training: a stochastic gradient descent-based differential privacy algorithm (DP-SGD), and a sparse vector differential privacy technique (DP-SVT). Results: Our experiments show that extreme data distributions across sites (imbalance either in the number of patients or the positive label ratios between sites) lead to a deterioration of model performance when trained using the FedAvg strategy. This issue is resolved when using FedProx with appropriate hyperparameter tuning. Furthermore, the results show that both differential privacy techniques can reach model performances similar to those of models trained without DP, albeit at the expense of a large quantifiable privacy leakage. Conclusions: We evaluate empirically the benefits of two federation strategies and propose optimal strategies for the choice of parameters when using differential privacy techniques.
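As a rough illustration of the two federation strategies compared in this abstract, the sketch below shows the sample-size-weighted parameter average used by FedAvg and the proximal term that FedProx adds to each site's local objective. The function names, the flat-list parameter representation, the two example sites, and the values of `mu` are illustrative assumptions and are not taken from the paper's code.

```python
from typing import List, Sequence


def fedavg(site_weights: Sequence[Sequence[float]], site_sizes: Sequence[int]) -> List[float]:
    """Average per-site parameter vectors, weighted by each site's sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def fedprox_penalty(local: Sequence[float], global_: Sequence[float], mu: float) -> float:
    """Proximal term (mu / 2) * ||w_local - w_global||^2 added to the local loss."""
    return 0.5 * mu * sum((l - g) ** 2 for l, g in zip(local, global_))


def fedprox_grad(local_grad: Sequence[float], local: Sequence[float],
                 global_: Sequence[float], mu: float) -> List[float]:
    """Gradient of (local loss + proximal term): grad + mu * (w_local - w_global)."""
    return [g + mu * (l - gl) for g, l, gl in zip(local_grad, local, global_)]


# Two hypothetical sites with imbalanced patient counts (300 vs. 100).
global_w = fedavg([[1.0, 2.0], [3.0, 6.0]], [300, 100])

# The smaller site's parameters are far from the aggregate, so FedProx penalizes its drift.
penalty = fedprox_penalty([3.0, 6.0], global_w, mu=0.1)
```

Setting `mu` to zero removes the proximal term and leaves the plain local update that FedAvg uses, which is why `mu` is the hyperparameter that needs tuning under heterogeneous data.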
CL · Jul 18, 2024 · Code
BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization
Ahmed Allam
Large Language Models (LLMs) have become pivotal in advancing natural language processing, yet their potential to perpetuate biases poses significant concerns. This paper introduces a new framework employing Direct Preference Optimization (DPO) to mitigate gender, racial, and religious biases in LLM-generated English text. By developing a loss function that favors less biased over biased completions, our approach cultivates a preference for respectful and non-discriminatory language in LLMs. We also contribute a manually designed dataset for training LLMs to recognize and correct biases. This dataset encompasses a diverse range of prompts paired with both biased and unbiased completions. Implementing this approach on the Microsoft Phi-2 model, we demonstrate substantial reductions in biased outputs as our model outperforms the baseline model on almost all bias benchmarks. Our model also achieves better performance compared to other open-source models on most benchmarks. By reducing biases in the language generated by the model, our study marks a significant step towards developing more ethical and socially responsible LLMs. We publicly release the BiasDPO dataset on HuggingFace.
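To make the idea of a loss that favors less biased over biased completions concrete, here is a minimal sketch of the standard DPO objective for a single preference pair. It assumes the summed log-probabilities of each completion under the trained policy and a frozen reference model are already available. The function name, argument names, and the default `beta` are assumptions for illustration; the paper's exact loss may differ in detail.

```python
import math


def dpo_loss(policy_logp_chosen: float, policy_logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    'chosen' is the preferred (here: less biased) completion and
    'rejected' is the dispreferred (biased) one.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # completion than the reference model does.
    margin = (policy_logp_chosen - ref_logp_chosen) - (policy_logp_rejected - ref_logp_rejected)
    z = beta * margin
    # Numerically stable -log(sigmoid(z)), i.e. softplus(-z).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical log-probabilities: the policy prefers the chosen completion
# more strongly than the reference does, so the loss falls below log(2).
loss_good = dpo_loss(-1.0, -5.0, -2.0, -2.0)
loss_bad = dpo_loss(-5.0, -1.0, -2.0, -2.0)
```

When the policy and reference agree exactly, the margin is zero and the loss equals log 2; training lowers it by widening the margin in favor of the preferred completion.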
LG · Mar 31, 2023
Simple Contrastive Representation Learning for Time Series Forecasting
Xiaochen Zheng, Xingyu Chen, Manuel Schürch et al.
Contrastive learning methods have shown an impressive ability to learn meaningful representations for image or time series classification. However, these methods are less effective for time series forecasting, as optimization of instance discrimination is not directly applicable to predicting the future state from the historical context. To address these limitations, we propose SimTS, a simple representation learning approach for improving time series forecasting by learning to predict the future from the past in the latent space. SimTS exclusively uses positive pairs and does not depend on negative pairs or specific characteristics of a given time series. In addition, we show the shortcomings of the current contrastive learning framework used for time series forecasting through a detailed ablation study. Overall, our work suggests that SimTS is a promising alternative to other contrastive learning approaches for time series forecasting.
ML · Sep 28, 2023
Generating Personalized Insulin Treatments Strategies with Deep Conditional Generative Time Series Models
Manuel Schürch, Xiang Li, Ahmed Allam et al.
We propose a novel framework that combines deep generative time series models with decision theory for generating personalized treatment strategies. It leverages historical patient trajectory data to jointly learn the generation of realistic personalized treatment and future outcome trajectories through deep generative time series models. In particular, our framework enables the generation of novel multivariate treatment strategies tailored to the personalized patient history and trained for optimal expected future outcomes based on conditional expected utility maximization. We demonstrate our framework by generating personalized insulin treatment strategies and blood glucose predictions for hospitalized diabetes patients, showcasing the potential of our approach for generating improved personalized treatment strategies. Keywords: deep generative model, probabilistic decision support, personalized treatment generation, insulin and blood glucose prediction
GN · Nov 13, 2023
Attention-based Multi-task Learning for Base Editor Outcome Prediction
Amina Mollaysa, Ahmed Allam, Michael Krauthammer
Human genetic diseases often arise from point mutations, emphasizing the critical need for precise genome editing techniques. Among these, base editing stands out as it allows targeted alterations at the single nucleotide level. However, its clinical application is hindered by low editing efficiency and unintended mutations, necessitating extensive trial-and-error experimentation in the laboratory. To speed up this process, we present an attention-based two-stage machine learning model that learns to predict the likelihood of all possible editing outcomes for a given genomic target sequence. We further propose a multi-task learning schema to jointly learn multiple base editors (i.e. variants) at once. Our model's predictions consistently demonstrated a strong correlation with the actual experimental results on multiple datasets and base editor variants. These results provide further validation for the models' capacity to enhance and accelerate the process of refining base editing designs.
LG · Oct 4, 2023
Attention-based Multi-task Learning for Base Editor Outcome Prediction
Amina Mollaysa, Ahmed Allam, Michael Krauthammer
Human genetic diseases often arise from point mutations, emphasizing the critical need for precise genome editing techniques. Among these, base editing stands out as it allows targeted alterations at the single nucleotide level. However, its clinical application is hindered by low editing efficiency and unintended mutations, necessitating extensive trial-and-error experimentation in the laboratory. To speed up this process, we present an attention-based two-stage machine learning model that learns to predict the likelihood of all possible editing outcomes for a given genomic target sequence. We further propose a multi-task learning schema to jointly learn multiple base editors (i.e. variants) at once. Our model's predictions consistently demonstrated a strong correlation with the actual experimental results on multiple datasets and base editor variants. These results provide further validation for the models' capacity to enhance and accelerate the process of refining base editing designs.
QM · Oct 3, 2022
DDoS: A Graph Neural Network based Drug Synergy Prediction Algorithm
Kyriakos Schwarz, Alicia Pliego-Mendieta, Amina Mollaysa et al.
Drug synergy arises when the combined impact of two drugs exceeds the sum of their individual effects. While single-drug effects on cell lines are well-documented, the scarcity of data on drug synergy, considering the vast array of potential drug combinations, prompts a growing interest in computational approaches for predicting synergies in untested drug pairs. We introduce a Graph Neural Network (GNN) based model for drug synergy prediction, which utilizes drug chemical structures and cell line gene expression data. We extract data from the largest available drug combination database (DrugComb) and generate multiple synergy scores (commonly used in the literature) to create seven datasets that serve as a reliable benchmark with high confidence. In contrast to conventional models relying on pre-computed chemical features, our GNN-based approach learns task-specific drug representations directly from the graph structure of the drugs, providing superior performance in predicting drug synergies. Our work suggests that learning task-specific drug representations and leveraging a diverse dataset is a promising approach to advancing our understanding of drug-drug interaction and synergy.
CV · May 24
Universal Boosts, Specific Suppressors: Sparse Autoencoder Steering of Medical Vision-Language Models
Farhad Nooralahzadeh, Benjamin Gundersen, Nicolas Deperrois et al.
Medical vision-language models (VLMs) often hallucinate findings when generating chest X-ray reports: they fabricate findings that are not present in the image, miss important ones, or locate them incorrectly. We mitigate this without weight updates by decoding-time residual steering on a per-token sparse autoencoder (SAE) basis: Top-K SAEs on late layers, causal steering against clinical errors, then combined suppress/boost intervention at inference time. On the MIMIC-CXR test split, our inference-only method improves the quality of generated reports for three radiology VLMs (RadVLM, LLaVA-Rad, and CheXOne), with relative improvements of +5.4%, +7.2%, and +17.0% in the clinical composite metric, and statistically significant GREEN gains on all backbones. A cross-model feature alignment shows that the quality-promoting (boost) directions overlap strongly across architectures, whereas hallucination-linked (suppress) directions are model-specific. Therefore, transferable steering must treat suppression per-backbone, rather than sharing a universal suppress list. The same recipe transfers zero-shot to IU-Xray (GREEN +7.7% rel.) without retraining, confirming that the identified features are properties of the model, not of the training corpus. We release causal feature sets and an interactive feature dashboard: https://cxr-sparse-feature-dashboard.netlify.app/.
LG · Jul 16, 2024
Semi-Supervised Generative Models for Disease Trajectories: A Case Study on Systemic Sclerosis
Cécile Trottet, Manuel Schürch, Ahmed Allam et al.
We propose a deep generative approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories, with a particular focus on Systemic Sclerosis (SSc). We aim to learn temporal latent representations of the underlying generative process that explain the observed patient disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent temporal processes, we develop a semi-supervised approach for disentangling the latent space using established medical knowledge. By combining the generative approach with medical definitions of different characteristics of SSc, we facilitate the discovery of new aspects of the disease. We show that the learned temporal latent processes can be utilized for further data analysis and clinical hypothesis testing, including finding similar patients and clustering SSc patient trajectories into novel sub-types. Moreover, our method enables personalized online monitoring and prediction of multivariate time series with uncertainty quantification.
LG · Nov 14, 2023
Modeling Complex Disease Trajectories using Deep Generative Models with Semi-Supervised Latent Processes
Cécile Trottet, Manuel Schürch, Ahmed Allam et al.
In this paper, we propose a deep generative time series approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories. We aim to find meaningful temporal latent representations of an underlying generative process that explain the observed disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent temporal processes, we develop a semi-supervised approach for disentangling the latent space using established medical concepts. By combining the generative approach with medical knowledge, we leverage the ability to discover novel aspects of the disease while integrating medical concepts into the model. We show that the learned temporal latent processes can be utilized for further data analysis and clinical hypothesis testing, including finding similar patients and clustering the disease into new sub-types. Moreover, our method enables personalized online monitoring and prediction of multivariate time series including uncertainty quantification. We demonstrate the effectiveness of our approach in modeling systemic sclerosis, showcasing the potential of our machine learning model to capture complex disease trajectories and acquire new medical knowledge.
LG · Feb 2
Repurposing Protein Language Models for Latent Flow-Based Fitness Optimization
Amaru Caceres Arroyo, Lea Bogensperger, Ahmed Allam et al.
Protein fitness optimization is challenged by a vast combinatorial landscape where high-fitness variants are extremely sparse. Many current methods either underperform or require computationally expensive gradient-based sampling. We present CHASE, a framework that repurposes the evolutionary knowledge of pretrained protein language models by compressing their embeddings into a compact latent space. By training a conditional flow-matching model with classifier-free guidance, we enable the direct generation of high-fitness variants without predictor-based guidance during the ODE sampling steps. CHASE achieves state-of-the-art performance on AAV and GFP protein design benchmarks. Finally, we show that bootstrapping with synthetic data can further enhance performance in data-constrained settings.
LG · Nov 13, 2023
Two-Stage Aggregation with Dynamic Local Attention for Irregular Time Series
Xingyu Chen, Xiaochen Zheng, Amina Mollaysa et al.
Irregular multivariate time series data is characterized by varying time intervals between consecutive observations of measured variables/signals (i.e., features) and varying sampling rates (i.e., recordings/measurement) across these features. Modeling time series while taking into account these irregularities is still a challenging task for machine learning methods. Here, we introduce TADA, a Two-stage Aggregation process with Dynamic local Attention to harmonize time-wise and feature-wise irregularities in multivariate time series. In the first stage, the irregular time series undergoes temporal embedding (TE) using all available features at each time step. This process preserves the contribution of each available feature and generates a fixed-dimensional representation per time step. The second stage introduces a dynamic local attention (DLA) mechanism with adaptive window sizes. DLA aggregates time recordings using feature-specific windows to harmonize irregular time intervals capturing feature-specific sampling rates. Then hierarchical MLP mixer layers process the output of DLA through multiscale patching to leverage information at various scales for the downstream tasks. TADA outperforms state-of-the-art methods on three real-world datasets, including the latest MIMIC IV dataset, and highlights its effectiveness in handling irregular multivariate time series and its potential for various real-world applications.
AR · Aug 21, 2025 · Code
ASIC-Agent: An Autonomous Multi-Agent System for ASIC Design with Benchmark Evaluation
Ahmed Allam, Youssef Mansour, Mohamed Shalan
Large Language Models (LLMs) have demonstrated remarkable capabilities in Register Transfer Level (RTL) design, enabling high-quality code generation from natural language descriptions. However, LLMs alone face significant limitations in real-world hardware design workflows, including the inability to execute code, lack of debugging capabilities, and absence of long-term memory. To address these challenges, we present ASIC-Agent, an autonomous system designed specifically for digital ASIC design tasks. ASIC-Agent enhances base LLMs with a multi-agent architecture incorporating specialized sub-agents for RTL generation, verification, OpenLane hardening, and Caravel chip integration, all operating within a comprehensive sandbox environment with access to essential hardware design tools. The system leverages a vector database containing documentation, API references, error knowledge, and curated insights from the open-source silicon community. To evaluate ASIC-Agent's performance, we introduce ASIC-Agent-Bench, the first benchmark specifically designed to assess agentic systems in hardware design tasks. We evaluate ASIC-Agent with various base LLMs, providing quantitative comparisons and qualitative insights into agent behavior across different design scenarios. Our results demonstrate that ASIC-Agent, when powered by Claude 4 Sonnet, successfully automates a broad range of ASIC design tasks spanning varying levels of complexity, showing the potential of significantly accelerating the ASIC design workflow.
LG · Jan 31, 2025
A Variational Perspective on Generative Protein Fitness Optimization
Lea Bogensperger, Dominik Narnhofer, Ahmed Allam et al.
The goal of protein fitness optimization is to discover new protein variants with enhanced fitness for a given use. The vast search space and the sparsely populated fitness landscape, along with the discrete nature of protein sequences, pose significant challenges when trying to determine the gradient towards configurations with higher fitness. We introduce Variational Latent Generative Protein Optimization (VLGPO), a variational perspective on fitness optimization. Our method embeds protein sequences in a continuous latent space to enable efficient sampling from the fitness distribution and combines a (learned) flow matching prior over sequence mutations with a fitness predictor to guide optimization towards sequences with high fitness. VLGPO achieves state-of-the-art results on two different protein benchmarks of varying complexity. Moreover, the variational design with explicit prior and likelihood functions offers a flexible plug-and-play framework that can be easily customized to suit various protein design tasks.
LG · Jan 10, 2025
TAMER: A Test-Time Adaptive MoE-Driven Framework for EHR Representation Learning
Yinghao Zhu, Xiaochen Zheng, Ahmed Allam et al.
We propose TAMER, a Test-time Adaptive MoE-driven framework for Electronic Health Record (EHR) Representation learning. TAMER introduces a framework where a Mixture-of-Experts (MoE) architecture is co-designed with Test-Time Adaptation (TTA) to jointly mitigate the intertwined challenges of patient heterogeneity and distribution shifts in EHR modeling. The MoE focuses on latent patient subgroups through domain-aware expert specialization, while TTA enables real-time adaptation to evolving health status distributions when new patient samples are introduced. Extensive experiments across four real-world EHR datasets demonstrate that TAMER consistently improves predictive performance for both mortality and readmission risk tasks when combined with diverse EHR modeling backbones. TAMER offers a promising approach for dynamic and personalized EHR-based predictions in practical clinical settings.
LG · May 6, 2024
Clustering of Disease Trajectories with Explainable Machine Learning: A Case Study on Postoperative Delirium Phenotypes
Xiaochen Zheng, Manuel Schürch, Xingyu Chen et al.
The identification of phenotypes within complex diseases or syndromes is a fundamental component of precision medicine, which aims to adapt healthcare to individual patient characteristics. Postoperative delirium (POD) is a complex neuropsychiatric condition with significant heterogeneity in its clinical manifestations and underlying pathophysiology. We hypothesize that POD comprises several distinct phenotypes, which cannot be directly observed in clinical practice. Identifying these phenotypes could enhance our understanding of POD pathogenesis and facilitate the development of targeted prevention and treatment strategies. In this paper, we propose an approach that combines supervised machine learning for personalized POD risk prediction with unsupervised clustering techniques to uncover potential POD phenotypes. We first demonstrate our approach using synthetic data, where we simulate patient cohorts with predefined phenotypes based on distinct sets of informative features. We aim to mimic any clinical disease with our synthetic data generation method. By training a predictive model and applying SHAP, we show that clustering patients in the SHAP feature importance space successfully recovers the true underlying phenotypes, outperforming clustering in the raw feature space. We then present a case study using real-world data from a cohort of elderly surgical patients. The results showcase the utility of our approach in uncovering clinically relevant subtypes of complex disorders like POD, paving the way for more precise and personalized treatment strategies.
QM · Dec 24, 2020
AttentionDDI: Siamese Attention-based Deep Learning method for drug-drug interaction predictions
Kyriakos Schwarz, Ahmed Allam, Nicolas Andres Perez Gonzalez et al.
Background: Drug-drug interactions (DDIs) refer to processes triggered by the administration of two or more drugs leading to side effects beyond those observed when drugs are administered by themselves. Due to the massive number of possible drug pairs, it is nearly impossible to experimentally test all combinations and discover previously unobserved side effects. Therefore, machine learning based methods are being used to address this issue. Methods: We propose a Siamese self-attention multi-modal neural network for DDI prediction that integrates multiple drug similarity measures that have been derived from a comparison of drug characteristics including drug targets, pathways and gene expression profiles. Results: Our proposed DDI prediction model provides multiple advantages: 1) It is trained end-to-end, overcoming limitations of models composed of multiple separate steps, 2) it offers model explainability via an Attention mechanism for identifying salient input features and 3) it achieves similar or better prediction performance (AUPR scores ranging from 0.77 to 0.92) compared to state-of-the-art DDI models when tested on various benchmark datasets. Novel DDI predictions are further validated using independent data resources. Conclusions: We find that a Siamese multi-modal neural network is able to accurately predict DDIs and that an Attention mechanism, typically used in the Natural Language Processing domain, can be beneficially applied to aid in DDI model explainability.
LG · May 14, 2020
Patient Similarity Analysis with Longitudinal Health Data
Ahmed Allam, Matthias Dittberner, Anna Sintsova et al.
Healthcare professionals have long envisioned using the enormous processing powers of computers to discover new facts and medical knowledge locked inside electronic health records. These vast medical archives contain time-resolved information about medical visits, tests and procedures, as well as outcomes, which together form individual patient journeys. By assessing the similarities among these journeys, it is possible to uncover clusters of common disease trajectories with shared health outcomes. The assignment of patient journeys to specific clusters may in turn serve as the basis for personalized outcome prediction and treatment selection. This procedure is a non-trivial computational problem, as it requires the comparison of patient data with multi-dimensional and multi-modal features that are captured at different times and resolutions. In this review, we provide a comprehensive overview of the tools and methods that are used in patient similarity analysis with longitudinal data and discuss its potential for improving clinical decision making.
LG · Dec 30, 2019
AutoDiscern: Rating the Quality of Online Health Information with Hierarchical Encoder Attention-based Neural Networks
Laura Kinkead, Ahmed Allam, Michael Krauthammer
Patients increasingly turn to search engines and online content before, or in place of, talking with a health professional. Low quality health information, which is common on the internet, presents risks to the patient in the form of misinformation and a possibly poorer relationship with their physician. To address this, the DISCERN criteria (developed at University of Oxford) are used to evaluate the quality of online health information. However, patients are unlikely to take the time to apply these criteria to the health websites they visit. We built an automated implementation of the DISCERN instrument (Brief version) using machine learning models. We compared the performance of a traditional model (Random Forest) with that of a hierarchical encoder attention-based neural network (HEA) model using two language embeddings, BERT and BioBERT. The HEA BERT and BioBERT models achieved average F1-macro scores across all criteria of 0.75 and 0.74, respectively, outperforming the Random Forest model (average F1-macro = 0.69). Overall, the neural network based models achieved 81% and 86% average accuracy at 100% and 80% coverage, respectively, compared to 94% manual rating accuracy. The attention mechanism implemented in the HEA architectures not only provided 'model explainability' by identifying reasonable supporting sentences for the documents fulfilling the Brief DISCERN criteria, but also boosted F1 performance by 0.05 compared to the same architecture without an attention mechanism. Our research suggests that it is feasible to automate online health information quality assessment, which is an important step towards empowering patients to become informed partners in the healthcare process.
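For readers unfamiliar with the metrics reported in this abstract, the sketch below computes a macro-averaged F1 score and an accuracy-at-coverage figure in plain Python. Interpreting "coverage" as keeping only the most confident fraction of predictions is an assumption made here, as are the function names and the toy labels and confidences; none of these come from the paper.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores: List[float] = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy_at_coverage(y_true: Sequence[int], y_pred: Sequence[int],
                         confidence: Sequence[float], coverage: float) -> float:
    """Accuracy on the `coverage` fraction of items with the highest confidence."""
    n_keep = max(1, round(coverage * len(y_true)))
    ranked = sorted(range(len(y_true)), key=lambda i: confidence[i], reverse=True)
    kept = ranked[:n_keep]
    return sum(y_true[i] == y_pred[i] for i in kept) / n_keep


# Toy example: one low-confidence mistake among five predictions.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1]
conf = [0.90, 0.55, 0.80, 0.70, 0.95]

f1 = macro_f1(y_true, y_pred)
acc_full = accuracy_at_coverage(y_true, y_pred, conf, 1.0)
acc_80 = accuracy_at_coverage(y_true, y_pred, conf, 0.8)
```

In the toy data the only error has the lowest confidence, so dropping the least confident 20% of predictions raises accuracy, which mirrors the trade-off between coverage and accuracy described in the abstract.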
LG · Dec 22, 2018
Neural networks versus Logistic regression for 30 days all-cause readmission prediction
Ahmed Allam, Mate Nagy, George Thoma et al.
Heart failure (HF) is one of the leading causes of hospital admissions in the US. Readmission within 30 days after a HF hospitalization is both a recognized indicator for disease progression and a source of considerable financial burden to the healthcare system. Consequently, the identification of patients at risk for readmission is a key step in improving disease management and patient outcome. In this work, we used a large administrative claims dataset to (1) explore the systematic application of neural network-based models versus logistic regression for predicting 30 days all-cause readmission after discharge from a HF admission, and (2) examine the additive value of patients' hospitalization timelines on prediction performance. Based on data from 272,778 (49% female) patients with a mean (SD) age of 73 years (14) and 343,328 HF admissions (67% of total admissions), we trained and tested our predictive readmission models following a stratified 5-fold cross-validation scheme. Among the deep learning approaches, a recurrent neural network (RNN) combined with conditional random fields (CRF) model (RNNCRF) achieved the best performance in readmission prediction with 0.642 AUC (95% CI, 0.640-0.645). Other models, such as those based on RNN, convolutional neural networks and CRF alone had lower performance, with a non-timeline based model (MLP) performing worst. A competitive model based on logistic regression with LASSO achieved a performance of 0.643 AUC (95% CI, 0.640-0.646). We conclude that data from patient timelines improve 30 day readmission prediction for neural network-based models, that a logistic regression with LASSO has equal performance to the best neural network model and that the use of administrative data results in competitive performance compared to published approaches based on richer clinical datasets.
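The evaluation protocol mentioned in this abstract (stratified 5-fold cross-validation scored by AUC) can be sketched in a few lines of plain Python. This is a simplified illustration: the fold assignment below is deterministic round-robin without shuffling, the AUC is computed by direct pairwise comparison rather than an optimized routine, and the function names and toy data are assumptions not taken from the paper.

```python
from typing import List, Sequence


def stratified_folds(y: Sequence[int], k: int = 5) -> List[List[int]]:
    """Split indices into k folds while keeping each label's share roughly equal."""
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        # Deal the indices of this label round-robin across the folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy cohort: 5 readmitted (1) and 15 not readmitted (0) admissions.
labels = [1] * 5 + [0] * 15
folds = stratified_folds(labels, k=5)

# Toy risk scores with one tie between a positive and a negative.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.4])
```

Each fold in the toy cohort receives one readmitted and three non-readmitted admissions, preserving the 25% positive rate. An AUC near 0.64, as reported in the abstract, means a readmitted patient receives a higher score than a non-readmitted one in roughly 64% of such pairs.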