CL | Oct 31, 2023
Keyword-optimized Template Insertion for Clinical Information Extraction via Prompt-based Learning
Eugenia Alleva, Isotta Landi, Leslee J Shaw et al.
Clinical note classification is a common clinical NLP task. However, annotated datasets are scarce. Prompt-based learning has recently emerged as an effective method to adapt pre-trained models for text classification using only a few training examples. A critical component of prompt design is the definition of the template (i.e., the prompt text). The effect of template position, however, has been insufficiently investigated. This seems particularly important in the clinical setting, where task-relevant information is usually sparse in clinical notes. In this study we develop a keyword-optimized template insertion method (KOTI) and show how optimizing position can improve performance on several clinical tasks in zero-shot and few-shot training settings.
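To make the idea of position-aware template insertion concrete, here is a minimal sketch. It is not the authors' KOTI implementation: the function name `insert_template`, the naive sentence splitter, the keyword list, and the `[MASK]` template are all assumptions chosen for illustration. It places the prompt template directly after the first sentence that mentions a task keyword, and falls back to appending it at the end of the note.

```python
import re


def insert_template(note: str, keywords: list[str], template: str) -> str:
    """Insert `template` right after the first sentence that mentions a keyword.

    Falls back to appending the template at the end of the note when no
    keyword is found. Hypothetical helper, not the published method.
    """
    # Naive split on ., ! or ? followed by whitespace (an assumption;
    # real clinical notes usually need a more robust sentence splitter).
    sentences = re.split(r"(?<=[.!?])\s+", note.strip())
    lowered = [k.lower() for k in keywords]
    for i, sentence in enumerate(sentences):
        if any(k in sentence.lower() for k in lowered):
            return " ".join(sentences[: i + 1] + [template] + sentences[i + 1:])
    return " ".join(sentences + [template])


note = "Patient seen today. Reports smoking one pack per day. Follow up in 2 weeks."
print(insert_template(note, ["smoking", "tobacco"], "Smoking status: [MASK]."))
```

Keeping the template next to the keyword-bearing sentence means the masked token sits close to the relevant evidence even when the note is truncated to fit a model's input length.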
CL | Sep 29, 2023
Clinical Text Deduplication Practices for Efficient Pretraining and Improved Clinical Tasks
Isotta Landi, Eugenia Alleva, Alissa A. Valentine et al.
Despite being a unique source of information on patients' status and disease progression, clinical notes are characterized by high levels of duplication and information redundancy. In general domain text, it has been shown that deduplication does not harm language model (LM) pretraining, thus helping reduce the training cost. Although large LMs have been shown to learn medical knowledge, they still require specialized domain adaptation for improved downstream clinical tasks. By leveraging large real-world clinical corpora, we first provided a fine-grained characterization of duplicates stemming from common writing practices and clinical relevancy. Second, we demonstrated that deduplicating clinical text can help clinical LMs encode less redundant information in a more efficient manner and does not harm classification tasks via prompt-based learning.
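As a rough illustration of what deduplicating clinical text can look like, the sketch below drops lines that repeat verbatim (after case and whitespace normalization) across a sequence of notes. The abstract does not specify the deduplication granularity or matching rule, so exact line-level matching and the helper name `deduplicate_lines` are assumptions.

```python
import hashlib


def deduplicate_lines(notes: list[str]) -> list[str]:
    """Remove lines already seen earlier in the corpus (exact match after
    lowercasing and whitespace normalization). Hypothetical helper."""
    seen: set[str] = set()
    cleaned = []
    for note in notes:
        kept = []
        for line in note.splitlines():
            normalized = " ".join(line.lower().split())
            if not normalized:
                continue  # skip blank lines
            digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
            if digest in seen:
                continue  # duplicate of an earlier line, e.g. copy-forward text
            seen.add(digest)
            kept.append(line.strip())
        cleaned.append("\n".join(kept))
    return cleaned


notes = [
    "HPI: chest pain.\nPlan: aspirin.",
    "HPI: chest pain.\nPlan: discharge.",
]
print(deduplicate_lines(notes))
```

Hashing the normalized line keeps memory use bounded by the number of distinct lines rather than by the total corpus size.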
CL | Mar 25, 2024 | Code
Extracting Social Support and Social Isolation Information from Clinical Psychiatry Notes: Comparing a Rule-based NLP System and a Large Language Model
Braja Gopal Patra, Lauren A. Lepow, Praneet Kasi Reddy Jagadeesh Kumar et al.
Background: Social support (SS) and social isolation (SI) are social determinants of health (SDOH) associated with psychiatric outcomes. In electronic health records (EHRs), individual-level SS/SI is typically documented in narrative clinical notes rather than as structured coded data. Natural language processing (NLP) algorithms can automate the otherwise labor-intensive process of data extraction. Data and Methods: Psychiatric encounter notes from Mount Sinai Health System (MSHS, n=300) and Weill Cornell Medicine (WCM, n=225) were annotated to establish a gold standard corpus. A rule-based system (RBS) involving lexicons and a large language model (LLM) using FLAN-T5-XL were developed to identify mentions of SS and SI and their subcategories (e.g., social network, instrumental support, and loneliness). Results: For extracting SS/SI, the RBS obtained higher macro-averaged F-scores than the LLM at both MSHS (0.89 vs. 0.65) and WCM (0.85 vs. 0.82). For extracting subcategories, the RBS also outperformed the LLM at both MSHS (0.90 vs. 0.62) and WCM (0.82 vs. 0.81). Discussion and Conclusion: Unexpectedly, the RBS outperformed the LLM across all metrics. Intensive review demonstrates that this finding is due to the divergent approaches taken by the RBS and the LLM. The RBS was designed and refined to follow the same specific rules as the gold standard annotations. Conversely, the LLM was more inclusive with categorization and conformed to common English-language understanding. Both approaches offer advantages and are made available open-source for future testing.
AI | Jan 5, 2024 | Code
Natural Language Programming in Medicine: Administering Evidence Based Clinical Workflows with Autonomous Agents Powered by Generative Large Language Models
Akhil Vaid, Joshua Lampert, Juhee Lee et al.
Generative Large Language Models (LLMs) hold significant promise in healthcare, demonstrating capabilities such as passing medical licensing exams and providing clinical knowledge. However, their current use as information retrieval tools is limited by challenges like data staleness, resource demands, and occasional generation of incorrect information. This study assessed the potential of LLMs to function as autonomous agents in a simulated tertiary care medical center, using real-world clinical cases across multiple specialties. Both proprietary and open-source LLMs were evaluated, with Retrieval Augmented Generation (RAG) enhancing contextual relevance. Proprietary models, particularly GPT-4, generally outperformed open-source models, showing improved guideline adherence and more accurate responses with RAG. The manual evaluation by expert clinicians was crucial in validating models' outputs, underscoring the importance of human oversight in LLM operation. Further, the study emphasizes Natural Language Programming (NLP) as the appropriate paradigm for modifying model behavior, allowing for precise adjustments through tailored prompts and real-world interactions. This approach highlights the potential of LLMs to significantly enhance and supplement clinical decision-making, while also emphasizing the value of continuous expert involvement and the flexibility of NLP to ensure their reliability and effectiveness in healthcare settings.
LG | Aug 27, 2020 | Code
reval: a Python package to determine best clustering solutions with stability-based relative clustering validation
Isotta Landi, Veronica Mandelli, Michael V. Lombardo
Determining the best partition for a dataset can be a challenging task because of 1) the lack of a priori information within an unsupervised learning framework; and 2) the absence of a unique clustering validation approach to evaluate clustering solutions. Here we present reval: a Python package that leverages stability-based relative clustering validation methods to determine best clustering solutions as the ones that best generalize to unseen data. Statistical software, both in R and Python, usually relies on internal validation metrics, such as silhouette, to select the number of clusters that best fits the data. Meanwhile, open-source software solutions that easily implement relative clustering techniques are lacking. Internal validation methods exploit characteristics of the data itself to produce a result, whereas relative approaches attempt to leverage the unknown underlying distribution of data points looking for generalizable and replicable results. The implementation of relative validation methods can further the theory of clustering by enriching the already available methods that can be used to investigate clustering results in different situations and for different data distributions. This work aims at contributing to this effort by developing a stability-based method that selects the best clustering solution as the one that replicates, via supervised learning, on unseen subsets of data. The package works with multiple clustering and classification algorithms, hence allowing both the automation of the labeling process and the assessment of the stability of different clustering mechanisms.
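One ingredient of checking whether a clustering replicates via supervised learning is comparing two labelings of the same held-out points: the labels from clustering them directly, and the labels predicted by a classifier trained on another subset. Because cluster names are arbitrary, the labels have to be aligned before disagreements are counted. The sketch below illustrates that step only. It does not use reval's API; the function name `aligned_error` is hypothetical, and the brute-force search over permutations is an assumption that is practical only for a small number of clusters.

```python
from itertools import permutations


def aligned_error(cluster_labels, predicted_labels):
    """Fraction of disagreeing points after the best relabeling of clusters.

    Tries every permutation of label names, so use it only for small k.
    """
    if not cluster_labels or len(cluster_labels) != len(predicted_labels):
        raise ValueError("label sequences must be non-empty and of equal length")
    names = sorted(set(cluster_labels) | set(predicted_labels))
    best = len(cluster_labels)
    for perm in permutations(names):
        mapping = dict(zip(names, perm))  # candidate renaming of cluster labels
        mismatches = sum(
            mapping[c] != p for c, p in zip(cluster_labels, predicted_labels)
        )
        best = min(best, mismatches)
    return best / len(cluster_labels)


# Same partition with swapped names: no disagreement after alignment.
print(aligned_error([0, 0, 1, 1], [1, 1, 0, 0]))
# One point assigned differently: a quarter of the points disagree.
print(aligned_error([0, 0, 1, 1], [1, 0, 0, 0]))
```

A lower aligned error on unseen data indicates a more stable solution; for larger numbers of clusters an assignment solver would replace the exhaustive search.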
OT | Sep 2, 2025
Quantifying Clinician Bias and its Effects on Schizophrenia Diagnosis in the Emergency Department of the Mount Sinai Health System
Alissa A. Valentine, Lauren A. Lepow, Lili Chan et al.
In the United States, schizophrenia (SCZ) carries a race and sex disparity that may be explained by clinician bias, a belief held by a clinician about a patient that prevents impartial clinical decision making. The emergency department (ED) is marked by higher rates of stress that lead to clinicians relying more on implicit biases during decision making. In this work, we considered a large cohort of psychiatric patients in the ED from the Mount Sinai Health System (MSHS) in New York City to investigate the effects of clinician bias on SCZ diagnosis while controlling for known risk factors and patient sociodemographic information. Clinician bias was quantified as the ratio of negative to total sentences within a patient's first ED note. We utilized a logistic regression to predict SCZ diagnosis given patient race, sex, age, history of trauma or substance use disorder, and the ratio of negative sentences. Our findings showed that an increased ratio of negative sentences is associated with higher odds of obtaining a SCZ diagnosis [OR (95% CI)=1.408 (1.361-1.456)]. Identifying as male [OR (95% CI)=1.112 (1.055-1.173)] or Black [OR (95% CI)=1.081 (1.031-1.133)] increased one's odds of being diagnosed with SCZ. However, from an intersectional lens, Black female patients with high SES have the highest odds of obtaining a SCZ diagnosis [OR (95% CI)=1.629 (1.535-1.729)]. Results such as these suggest that SES does not act as a protective buffer against SCZ diagnosis in all patients, demanding more attention to the quantification of health disparities. Lastly, we demonstrated that clinician bias can be operationalized with real-world data and is related to increased odds of obtaining a stigmatizing diagnosis such as SCZ.
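The two quantities in this abstract can be sketched in a few lines. The first function computes the ratio of negative to total sentences in a note; the abstract does not say how sentences were labeled as negative, so the per-sentence labels are assumed to be given. The second converts a logistic regression coefficient and its standard error into an odds ratio with a Wald 95% confidence interval, using the standard relation OR = exp(beta). Function names and the example inputs are hypothetical and are not the study's data.

```python
import math


def negative_sentence_ratio(sentence_labels):
    """Ratio of sentences labeled "negative" to all sentences in one note."""
    if not sentence_labels:
        raise ValueError("a note must contain at least one sentence")
    negative = sum(1 for label in sentence_labels if label == "negative")
    return negative / len(sentence_labels)


def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Odds ratio and Wald confidence interval from a logistic coefficient."""
    return (
        math.exp(beta),
        math.exp(beta - z * std_err),
        math.exp(beta + z * std_err),
    )


# Illustrative labels only, not taken from the paper.
labels = ["negative", "neutral", "positive", "negative"]
print(negative_sentence_ratio(labels))
print(odds_ratio_with_ci(beta=0.0, std_err=0.1))
```

An odds ratio above 1 whose interval excludes 1, as in the results reported above, indicates that higher values of the predictor are associated with higher odds of the diagnosis.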
QM | Mar 14, 2020
Deep Representation Learning of Electronic Health Records to Unlock Patient Stratification at Scale
Isotta Landi, Benjamin S. Glicksberg, Hao-Chih Lee et al.
Deriving disease subtypes from electronic health records (EHRs) can guide next-generation personalized medicine. However, challenges in summarizing and representing patient data prevent widespread practice of scalable EHR-based stratification analysis. Here we present an unsupervised framework based on deep learning to process heterogeneous EHRs and derive patient representations that can efficiently and effectively enable patient stratification at scale. We considered EHRs of 1,608,741 patients from a diverse hospital cohort comprising a total of 57,464 clinical concepts. We introduce a representation learning model based on word embeddings, convolutional neural networks, and autoencoders (i.e., ConvAE) to transform patient trajectories into low-dimensional latent vectors. We evaluated these representations as broadly enabling patient stratification by applying hierarchical clustering to different multi-disease and disease-specific patient cohorts. ConvAE significantly outperformed several baselines in a clustering task to identify patients with different complex conditions, with 2.61 entropy and 0.31 purity average scores. When applied to stratify patients within a certain condition, ConvAE led to various clinically relevant subtypes for different disorders, including type 2 diabetes, Parkinson's disease and Alzheimer's disease, largely related to comorbidities, disease progression, and symptom severity. With these results, we demonstrate that ConvAE can generate patient representations that lead to clinically meaningful insights. This scalable framework can help better understand varying etiologies in heterogeneous sub-populations and unlock patterns for EHR-based research in the realm of personalized medicine.
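For readers unfamiliar with the entropy and purity scores mentioned above, the sketch below computes the textbook versions of both from cluster assignments and reference labels. Whether the paper used exactly these definitions, and which logarithm base, is not stated in the abstract; base 2 and the function name `purity_and_entropy` are assumptions.

```python
import math
from collections import Counter, defaultdict


def purity_and_entropy(cluster_ids, true_labels):
    """Return (purity, size-weighted average entropy) of a clustering.

    Purity is the share of points belonging to the majority label of their
    cluster; entropy measures how mixed the labels are within each cluster.
    """
    clusters = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        clusters[cluster][label] += 1
    n = len(cluster_ids)
    purity = sum(max(counts.values()) for counts in clusters.values()) / n
    entropy = 0.0
    for counts in clusters.values():
        size = sum(counts.values())
        h = -sum((v / size) * math.log2(v / size) for v in counts.values())
        entropy += (size / n) * h  # weight each cluster by its size
    return purity, entropy


# Clusters that match the labels exactly: highest purity, zero entropy.
print(purity_and_entropy([0, 0, 1, 1], ["a", "a", "b", "b"]))
# One cluster mixing two labels equally: purity 0.5, entropy 1 bit.
print(purity_and_entropy([0, 0, 0, 0], ["a", "a", "b", "b"]))
```

Higher purity and lower entropy both indicate clusters that align more closely with the reference conditions.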
QM | Oct 16, 2017
Convolutional neural networks for structured omics: OmicsCNN and the OmicsConv layer
Giuseppe Jurman, Valerio Maggio, Diego Fioravanti et al.
Convolutional Neural Networks (CNNs) are a popular deep learning architecture widely applied in different domains, in particular in classifying images, for which the concept of convolution with a filter comes naturally. Unfortunately, the requirement of a distance (or, at least, of a neighbourhood function) in the input feature space has so far prevented their direct use on data types such as omics data. However, a number of omics data are metrizable, i.e., they can be endowed with a metric structure, enabling the adoption of a convolution-based deep learning framework, e.g., for prediction. We propose a generalized solution for CNNs on omics data, implemented through a dedicated Keras layer. In particular, for metagenomics data, a metric can be derived from the patristic distance on the phylogenetic tree. For transcriptomics data, we combine Gene Ontology semantic similarity and gene co-expression to define a distance; the function is defined through a multilayer network where 3 layers are defined by the GO mutual semantic similarity and the fourth by gene co-expression. As a general tool, feature distance on omics data is enabled by OmicsConv, a novel Keras layer, obtaining OmicsCNN, a dedicated deep learning framework. Here we demonstrate OmicsCNN on gut microbiota sequencing data, for Inflammatory Bowel Disease (IBD) 16S data, first on synthetic data and then on a metagenomics collection of gut microbiota of 222 IBD patients.