Adler Perotte

h-index26

11papers

482citations

Novelty57%

AI Score32

Ranked #124,399 of 194,257 authors (top 64%)#27,357 in LG (top 68%)

11 Papers

1.2SPApr 30, 2022

Electrocardiographic Deep Learning for Predicting Post-Procedural Mortality

David Ouyang, John Theurer, Nathan R. Stein et al.

Background. Pre-operative risk assessments used in clinical practice are limited in their ability to identify risk for post-operative mortality. We hypothesize that electrocardiograms contain hidden risk markers that can help prognosticate post-operative mortality. Methods. In a derivation cohort of 45,969 pre-operative patients (age 59+- 19 years, 55 percent women), a deep learning algorithm was developed to leverage waveform signals from pre-operative ECGs to discriminate post-operative mortality. Model performance was assessed in a holdout internal test dataset and in two external hospital cohorts and compared with the Revised Cardiac Risk Index (RCRI) score. Results. In the derivation cohort, there were 1,452 deaths. The algorithm discriminates mortality with an AUC of 0.83 (95% CI 0.79-0.87) surpassing the discrimination of the RCRI score with an AUC of 0.67 (CI 0.61-0.72) in the held out test cohort. Patients determined to be high risk by the deep learning model's risk prediction had an unadjusted odds ratio (OR) of 8.83 (5.57-13.20) for post-operative mortality as compared to an unadjusted OR of 2.08 (CI 0.77-3.50) for post-operative mortality for RCRI greater than 2. The deep learning algorithm performed similarly for patients undergoing cardiac surgery with an AUC of 0.85 (CI 0.77-0.92), non-cardiac surgery with an AUC of 0.83 (0.79-0.88), and catherization or endoscopy suite procedures with an AUC of 0.76 (0.72-0.81). The algorithm similarly discriminated risk for mortality in two separate external validation cohorts from independent healthcare systems with AUCs of 0.79 (0.75-0.83) and 0.75 (0.74-0.76) respectively. Conclusion. The findings demonstrate how a novel deep learning algorithm, applied to pre-operative ECGs, can improve discrimination of post-operative mortality.

2.0LGNov 2, 2023

Accurate and Scalable Stochastic Gaussian Process Regression via Learnable Coreset-based Variational Inference

Mert Ketenci, Adler Perotte, Noémie Elhadad et al.

We introduce a novel stochastic variational inference method for Gaussian process ($\mathcal{GP}$) regression, by deriving a posterior over a learnable set of coresets: i.e., over pseudo-input/output, weighted pairs. Unlike former free-form variational families for stochastic inference, our coreset-based variational $\mathcal{GP}$ (CVGP) is defined in terms of the $\mathcal{GP}$ prior and the (weighted) data likelihood. This formulation naturally incorporates inductive biases of the prior, and ensures its kernel and likelihood dependencies are shared with the posterior. We derive a variational lower-bound on the log-marginal likelihood by marginalizing over the latent $\mathcal{GP}$ coreset variables, and show that CVGP's lower-bound is amenable to stochastic optimization. CVGP reduces the dimensionality of the variational parameter search space to linear $\mathcal{O}(M)$ complexity, while ensuring numerical stability at $\mathcal{O}(M^3)$ time complexity and $\mathcal{O}(M^2)$ space complexity.

2.6LGFeb 6, 2024

Probabilistic Shapley Value Modeling and Inference

Mert Ketenci, Iñigo Urteaga, Victor Alfonso Rodriguez et al.

We propose probabilistic Shapley inference (PSI), a novel probabilistic framework to model and infer sufficient statistics of feature attributions in flexible predictive models, via latent random variables whose mean recovers Shapley values. PSI enables efficient, scalable inference over input-to-output attributions, and their uncertainty, via a variational objective that jointly trains a predictive (regression or classification) model and its attribution distributions. To address the challenge of marginalizing over variable-length input feature subsets in Shapley value calculation, we introduce a masking-based neural network architecture, with a modular training and inference procedure. We evaluate PSI on synthetic and real-world datasets, showing that it achieves competitive predictive performance compared to strong baselines, while learning feature attribution distributions -- centered at Shapley values -- that reveal meaningful attribution uncertainty across data modalities.

23.4LGNov 10, 2021

CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks

Chao Pang, Xinzhuo Jiang, Krishna S Kalluri et al.

Embedding algorithms are increasingly used to represent clinical concepts in healthcare for improving machine learning tasks such as clinical phenotyping and disease prediction. Recent studies have adapted state-of-the-art bidirectional encoder representations from transformers (BERT) architecture to structured electronic health records (EHR) data for the generation of contextualized concept embeddings, yet do not fully incorporate temporal data across multiple clinical domains. Therefore we developed a new BERT adaptation, CEHR-BERT, to incorporate temporal information using a hybrid approach by augmenting the input to BERT using artificial time tokens, incorporating time, age, and concept embeddings, and introducing a new second learning objective for visit type. CEHR-BERT was trained on a subset of Columbia University Irving Medical Center-York Presbyterian Hospital's clinical data, which includes 2.4M patients, spanning over three decades, and tested using 4-fold cross-validation on the following prediction tasks: hospitalization, death, new heart failure (HF) diagnosis, and HF readmission. Our experiments show that CEHR-BERT outperformed existing state-of-the-art clinical BERT adaptations and baseline models across all 4 prediction tasks in both ROC-AUC and PR-AUC. CEHR-BERT also demonstrated strong transfer learning capability, as our model trained on only 5% of data outperformed comparison models trained on the entire data set. Ablation studies to better understand the contribution of each time component showed incremental gains with every element, suggesting that CEHR-BERT's incorporation of artificial time tokens, time and age embeddings with concept embeddings, and the addition of the second learning objective represents a promising approach for future BERT-based clinical embeddings.

7.3MEFeb 17, 2021

Causal Estimation with Functional Confounders

Aahlad Puli, Adler J. Perotte, Rajesh Ranganath

Causal inference relies on two fundamental assumptions: ignorability and positivity. We study causal inference when the true confounder value can be expressed as a function of the observed data; we call this setting estimation with functional confounders (EFC). In this setting, ignorability is satisfied, however positivity is violated, and causal inference is impossible in general. We consider two scenarios where causal effects are estimable. First, we discuss interventions on a part of the treatment called functional interventions and a sufficient condition for effect estimation of these interventions called functional positivity. Second, we develop conditions for nonparametric effect estimation based on the gradient fields of the functional confounder and the true outcome function. To estimate effects under these conditions, we develop Level-set Orthogonal Descent Estimation (LODE). Further, we prove error bounds on LODE's effect estimates, evaluate our methods on simulated and real data, and empirically demonstrate the value of EFC.

18.6LGJan 13, 2021Code

X-CAL: Explicit Calibration for Survival Analysis

Mark Goldstein, Xintian Han, Aahlad Puli et al.

Survival analysis models the distribution of time until an event of interest, such as discharge from the hospital or admission to the ICU. When a model's predicted number of events within any time interval is similar to the observed number, it is called well-calibrated. A survival model's calibration can be measured using, for instance, distributional calibration (D-CALIBRATION) [Haider et al., 2020] which computes the squared difference between the observed and predicted number of events within different time intervals. Classically, calibration is addressed in post-training analysis. We develop explicit calibration (X-CAL), which turns D-CALIBRATION into a differentiable objective that can be used in survival modeling alongside maximum likelihood estimation and other objectives. X-CAL allows practitioners to directly optimize calibration and strike a desired balance between predictive power and calibration. In our experiments, we fit a variety of shallow and deep models on simulated data, a survival dataset based on MNIST, on length-of-stay prediction using MIMIC-III data, and on brain cancer data from The Cancer Genome Atlas. We show that the models we study can be miscalibrated. We give experimental evidence on these datasets that X-CAL improves D-CALIBRATION without a large decrease in concordance or likelihood.

1.1CLSep 29, 2020Code

Zero-Shot Clinical Acronym Expansion via Latent Meaning Cells

Griffin Adams, Mert Ketenci, Shreyas Bhave et al.

We introduce Latent Meaning Cells, a deep latent variable model which learns contextualized representations of words by combining local lexical context and metadata. Metadata can refer to granular context, such as section type, or to more global context, such as unique document ids. Reliance on metadata for contextualized representation learning is apropos in the clinical domain where text is semi-structured and expresses high variation in topics. We evaluate the LMC model on the task of zero-shot clinical acronym expansion across three datasets. The LMC significantly outperforms a diverse set of baselines at a fraction of the pre-training cost and learns clinically coherent representations. We demonstrate that not only is metadata itself very helpful for the task, but that the LMC inference algorithm provides an additional large benefit.

4.2LGJan 9, 2020

The Counterfactual $χ$-GAN

Amelia J. Averitt, Natnicha Vanitchanant, Rajesh Ranganath et al.

Causal inference often relies on the counterfactual framework, which requires that treatment assignment is independent of the outcome, known as strong ignorability. Approaches to enforcing strong ignorability in causal analyses of observational data include weighting and matching methods. Effect estimates, such as the average treatment effect (ATE), are then estimated as expectations under the reweighted or matched distribution, P . The choice of P is important and can impact the interpretation of the effect estimate and the variance of effect estimates. In this work, instead of specifying P, we learn a distribution that simultaneously maximizes coverage and minimizes variance of ATE estimates. In order to learn this distribution, this research proposes a generative adversarial network (GAN)-based model called the Counterfactual $χ$-GAN (cGAN), which also learns feature-balancing weights and supports unbiased causal estimation in the absence of unobserved confounding. Our model minimizes the Pearson $χ^2$ divergence, which we show simultaneously maximizes coverage and minimizes the variance of importance sampling estimates. To our knowledge, this is the first such application of the Pearson $χ^2$ divergence. We demonstrate the effectiveness of cGAN in achieving feature balance relative to established weighting methods in simulation and with real-world medical data.

2.9LGDec 7, 2018

Phenotype Inference with Semi-Supervised Mixed Membership Models

Victor Rodriguez, Adler Perotte

Disease phenotyping algorithms process observational clinical data to identify patients with specific diseases. Supervised phenotyping methods require significant quantities of expert-labeled data, while unsupervised methods may learn non-disease phenotypes. To address these limitations, we propose the Semi-Supervised Mixed Membership Model (SS3M) -- a probabilistic graphical model for learning disease phenotypes from clinical data with relatively few labels. We show SS3M can learn interpretable, disease-specific phenotypes which capture the clinical characteristics of the diseases specified by the labels provided.

16.9MLMay 21, 2018

Multiple Causal Inference with Latent Confounding

Rajesh Ranganath, Adler Perotte

Causal inference from observational data requires assumptions. These assumptions range from measuring confounders to identifying instruments. Traditionally, causal inference assumptions have focused on estimation of effects for a single treatment. In this work, we construct techniques for estimation with multiple treatments in the presence of unobserved confounding. We develop two assumptions based on shared confounding between treatments and independence of treatments given the confounder. Together, these assumptions lead to a confounder estimator regularized by mutual information. For this estimator, we develop a tractable lower bound. To recover treatment effects, we use the residual information in the treatments independent of the confounder. We validate on simulations and an example from clinical medicine.

28.4MLAug 6, 2016

Deep Survival Analysis

Rajesh Ranganath, Adler Perotte, Noémie Elhadad et al.

The electronic health record (EHR) provides an unprecedented opportunity to build actionable tools to support physicians at the point of care. In this paper, we investigate survival analysis in the context of EHR data. We introduce deep survival analysis, a hierarchical generative approach to survival analysis. It departs from previous approaches in two primary ways: (1) all observations, including covariates, are modeled jointly conditioned on a rich latent structure; and (2) the observations are aligned by their failure time, rather than by an arbitrary time zero as in traditional survival analysis. Further, it (3) scalably handles heterogeneous (continuous and discrete) data types that occur in the EHR. We validate deep survival analysis model by stratifying patients according to risk of developing coronary heart disease (CHD). Specifically, we study a dataset of 313,000 patients corresponding to 5.5 million months of observations. When compared to the clinically validated Framingham CHD risk score, deep survival analysis is significantly superior in stratifying patients according to their risk.