CL · Sep 11, 2023

Zero-shot Learning with Minimum Instruction to Extract Social Determinants and Family History from Clinical Notes using GPT Model

Neel Bhate, Ansh Mittal, Zhe He, Xiao Luo

arXiv:2309.05475v22.518 citations · h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses the problem of efficiently extracting key unstructured health information for healthcare researchers and practitioners, but it is incremental as it builds on existing GPT applications with a focus on zero-shot learning.

The study tackled extracting demographics, social determinants of health, and family history from clinical notes using GPT models in a zero-shot learning setup with minimal instructions, achieving average F1 scores of 0.975, 0.615, and 0.722 respectively.

Demographics, social determinants of health, and family history documented in the unstructured text of electronic health records are increasingly being studied to understand how this information can be combined with structured data to improve healthcare outcomes. Since the release of the GPT models, many studies have applied them to extract this information from narrative clinical notes. Unlike existing work, our research investigates zero-shot learning for extracting all of this information together while providing minimal information to the GPT model. We use de-identified real-world clinical notes annotated for demographics, various social determinants, and family history information. Because the GPT model may return text that differs from the text in the original data, we explore two sets of evaluation metrics, traditional NER evaluation metrics and semantic similarity evaluation metrics, to fully understand the performance. Our results show that the GPT-3.5 method achieved an average F1 of 0.975 on demographics extraction, 0.615 on social determinants extraction, and 0.722 on family history extraction. We believe these results can be further improved through model fine-tuning or few-shot learning. Through case studies, we also identified limitations of the GPT models that need to be addressed in future research.
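
The abstract's point about generated text differing from the annotated text can be made concrete with a small sketch. The Python below scores the same predictions under an exact-match criterion and a relaxed one. It is an illustration only: the function names, the example strings, the 0.5 threshold, and the token-overlap (Jaccard) score are assumptions standing in for the paper's semantic similarity metrics, which this listing does not specify.

```python
from typing import Callable, List, Tuple


def token_jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a crude stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def exact(gold: str, pred: str) -> bool:
    """Strict criterion: strings must be equal up to case and outer whitespace."""
    return gold.strip().lower() == pred.strip().lower()


def relaxed(gold: str, pred: str, threshold: float = 0.5) -> bool:
    """Relaxed criterion: token overlap at or above a hypothetical threshold."""
    return token_jaccard(gold, pred) >= threshold


def prf1(
    gold: List[str], pred: List[str], match: Callable[[str, str], bool]
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 with greedy one-to-one matching of predictions to gold."""
    unmatched = list(gold)
    tp = 0
    for p in pred:
        for g in unmatched:
            if match(g, p):
                unmatched.remove(g)  # each gold item can be matched only once
                tp += 1
                break
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example strings, not taken from the paper's data.
gold = ["former smoker", "lives alone", "mother had breast cancer"]
pred = ["former smoker", "patient lives alone", "diabetes"]

print("exact:  ", prf1(gold, pred, exact))
print("relaxed:", prf1(gold, pred, relaxed))
```

In this toy case the prediction "patient lives alone" counts as a miss under exact matching but as a hit under the relaxed criterion, which is the kind of gap a second, similarity-based metric set is meant to expose.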
