LG Mar 24, 2023
Towards Outcome-Driven Patient Subgroups: A Machine Learning Analysis Across Six Depression Treatment Studies
David Benrimoh, Akiva Kleinerman, Toshi A. Furukawa et al.
Major depressive disorder (MDD) is a heterogeneous condition; multiple underlying neurobiological substrates could be associated with treatment response variability. Understanding the sources of this variability and predicting outcomes has been elusive. Machine learning has shown promise in predicting treatment response in MDD, but one limitation has been the lack of clinical interpretability of machine learning models. We analyzed data from six clinical trials of pharmacological treatment for depression (total n = 5438) using the Differential Prototypes Neural Network (DPNN), a neural network model that derives patient prototypes which can be used to derive treatment-relevant patient clusters while learning to generate probabilities for differential treatment response. A model classifying remission and outputting individual remission probabilities for five first-line monotherapies and three combination treatments was trained using clinical and demographic data. Model validity and clinical utility were measured based on area under the curve (AUC) and expected improvement in sample remission rate with model-guided treatment, respectively. Post-hoc analyses yielded clusters (subgroups) based on patient prototypes learned during training. Prototypes were evaluated for interpretability by assessing differences in feature distributions and treatment-specific outcomes. A 3-prototype model achieved an AUC of 0.66 and an expected absolute improvement in population remission rate compared to the sample remission rate. We identified three treatment-relevant patient clusters which were clinically interpretable. It is possible to produce novel treatment-relevant patient profiles using machine learning models; doing so may improve precision medicine for depression. Note: This model is not currently the subject of any active clinical trials and is not intended for clinical use.
CL Feb 20
Predicting Contextual Informativeness for Vocabulary Learning using Deep Learning
Tao Wu, Adam Kapelner
We describe a modern deep learning system that automatically identifies informative contextual examples ("contexts") for first language vocabulary instruction for high school students. Our paper compares three modeling approaches: (i) an unsupervised similarity-based strategy using MPNet's uniformly contextualized embeddings, (ii) a supervised framework built on instruction-aware, fine-tuned Qwen3 embeddings with a nonlinear regression head, and (iii) model (ii) plus handcrafted context features. We introduce a novel metric called the Retention Competency Curve to visualize trade-offs between the discarded proportion of good contexts and the "good-to-bad" contexts ratio, providing a compact, unified lens on model performance. Model (iii) delivers the most dramatic gains, reaching a good-to-bad ratio of 440 while throwing out only 70% of the good contexts. In summary, we demonstrate that a modern embedding model built on a neural network architecture, when guided by human supervision, yields a low-cost, large supply of near-perfect contexts for teaching vocabulary for a variety of target words.
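The two quantities traded off in this abstract can be illustrated with a minimal Python sketch. It assumes each context has a model score and a binary good/bad label, and that contexts scoring below a threshold are discarded. The function name `retention_tradeoff`, the thresholding rule, and the toy data are illustrative assumptions, not the paper's definition of the Retention Competency Curve.

```python
from typing import List, Sequence, Tuple


def retention_tradeoff(
    scores: Sequence[float],
    is_good: Sequence[bool],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """For each threshold, keep contexts with score >= threshold and return
    (threshold, proportion of good contexts discarded, good-to-bad ratio kept)."""
    total_good = sum(is_good)
    rows = []
    for t in thresholds:
        # Labels of the contexts that survive the cut (assumed rule: score >= t).
        kept = [g for s, g in zip(scores, is_good) if s >= t]
        kept_good = sum(kept)
        kept_bad = len(kept) - kept_good
        discarded_good = 1 - kept_good / total_good if total_good else 0.0
        # The ratio is undefined when no bad context is kept; report infinity.
        ratio = kept_good / kept_bad if kept_bad else float("inf")
        rows.append((t, discarded_good, ratio))
    return rows


# Toy data (hypothetical): six contexts with scores and good/bad labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
is_good = [True, True, False, True, False, False]
rows = retention_tradeoff(scores, is_good, thresholds=[0.5, 0.75])
for threshold, discarded, ratio in rows:
    print(f"threshold={threshold:.2f} discarded_good={discarded:.2f} ratio={ratio}")
```

Plotting the second column against the third over a fine grid of thresholds would give one curve per model, which is the kind of comparison the abstract describes.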
NC Jun 7, 2024
Development and Validation of a Deep-Learning Model for Differential Treatment Benefit Prediction for Adults with Major Depressive Disorder Deployed in the Artificial Intelligence in Depression Medication Enhancement (AIDME) Study
David Benrimoh, Caitrin Armstrong, Joseph Mehltretter et al.
INTRODUCTION: The pharmacological treatment of Major Depressive Disorder (MDD) relies on a trial-and-error approach. We introduce an artificial intelligence (AI) model aiming to personalize treatment and improve outcomes, which was deployed in the Artificial Intelligence in Depression Medication Enhancement (AIDME) Study. OBJECTIVES: 1) Develop a model capable of predicting probabilities of remission across multiple pharmacological treatments for adults with at least moderate major depression. 2) Validate model predictions and examine them for amplification of harmful biases. METHODS: Data from previous clinical trials of antidepressant medications were standardized into a common framework and included 9,042 adults with moderate to severe major depression. Feature selection retained 25 clinical and demographic variables. Using Bayesian optimization, a deep learning model was trained on the training set, refined using the validation set, and tested once on the held-out test set. RESULTS: In the evaluation on the held-out test set, the model achieved an AUC of 0.65. The model outperformed a null model on the test set (p = 0.01). The model demonstrated clinical utility, achieving an absolute improvement in population remission rate in hypothetical and actual improvement testing. While the model did identify one drug (escitalopram) as generally outperforming the other drugs (consistent with the input data), there was otherwise significant variation in drug rankings. On bias testing, the model did not amplify potentially harmful biases. CONCLUSIONS: We demonstrate the first model capable of predicting outcomes for 10 different treatment options for patients with MDD, intended to be used at or near the start of treatment to personalize treatment. The model was put into clinical practice during the AIDME randomized controlled trial, whose results are reported separately.
ML Dec 8, 2013
bartMachine: Machine Learning with Bayesian Additive Regression Trees
Adam Kapelner, Justin Bleich
We present a new package in R implementing Bayesian additive regression trees (BART). The package introduces many new features for data analysis using BART such as variable selection, interaction detection, model diagnostic plots, incorporation of missing data and the ability to save trees for future prediction. It is significantly faster than the current R implementation, parallelized, and capable of handling both large sample sizes and high-dimensional data.
ML Jun 3, 2013
Prediction with Missing Data via Bayesian Additive Regression Trees
Adam Kapelner, Justin Bleich
We present a method for incorporating missing data in non-parametric statistical learning without the need for imputation. We focus on a tree-based method, Bayesian Additive Regression Trees (BART), enhanced with "Missingness Incorporated in Attributes," a recently proposed approach for incorporating missingness into decision trees (Twala, 2008). This procedure takes advantage of the partitioning mechanisms found in tree-based models. Simulations on generated models and real data indicate that our proposed method can forecast well on complicated missing-at-random and not-missing-at-random models as well as models where missingness itself influences the response. Our procedure has higher predictive performance and is more stable than competitors in many cases. We also illustrate BART's abilities to incorporate missingness into uncertainty intervals and to detect the influence of missingness on the model fit.
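As a rough illustration of how "Missingness Incorporated in Attributes" lets a tree partition on missing values without imputation, the Python sketch below routes a single value through one split under the three rule types usually described for this approach: missing values sent left, missing values sent right, or a split on missingness itself. The function name `mia_route`, the rule labels, and the side conventions are assumptions made for illustration; this is not the authors' implementation, and how a rule is selected for a given node is outside the sketch.

```python
import math
from typing import Optional

RULES = {"missing_left", "missing_right", "missing_vs_observed"}


def mia_route(x: Optional[float], threshold: float, rule: str) -> str:
    """Return 'left' or 'right' for value x under a MIA-style split rule."""
    if rule not in RULES:
        raise ValueError(f"unknown rule: {rule}")
    # Treat both None and NaN as missing.
    missing = x is None or (isinstance(x, float) and math.isnan(x))
    if rule == "missing_vs_observed":
        # Split on missingness itself (assumed convention: missing goes left).
        return "left" if missing else "right"
    if missing:
        # Missing values follow the side named by the rule.
        return "left" if rule == "missing_left" else "right"
    # Observed values use the ordinary threshold comparison.
    return "left" if x <= threshold else "right"


# Usage: the same missing value lands on different sides under different rules.
print(mia_route(float("nan"), 1.0, "missing_left"))   # left
print(mia_route(float("nan"), 1.0, "missing_right"))  # right
print(mia_route(0.5, 1.0, "missing_right"))           # left
```

Because the side that receives missing values is part of the split rule, missingness can influence predictions directly, which matches the abstract's point about models where missingness itself affects the response.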
OT Oct 3, 2012
Breaking Monotony with Meaning: Motivation in Crowdsourcing Markets
Dana Chandler, Adam Kapelner
We conduct the first natural field experiment to explore the relationship between the "meaningfulness" of a task and worker effort. We employed about 2,500 workers from Amazon's Mechanical Turk (MTurk), an online labor market, to label medical images. Although given an identical task, we experimentally manipulated how the task was framed. Subjects in the meaningful treatment were told that they were labeling tumor cells in order to assist medical researchers, subjects in the zero-context condition (the control group) were not told the purpose of the task, and, in stark contrast, subjects in the shredded treatment were not given context and were additionally told that their work would be discarded. We found that when a task was framed more meaningfully, workers were more likely to participate. We also found that the meaningful treatment increased the quantity of output (with an insignificant change in quality) while the shredded treatment decreased the quality of output (with no change in quantity). We believe these results will generalize to other short-term labor markets. Our study also discusses MTurk as an exciting platform for running natural field experiments in economics.