Prompt-based Extraction of Social Determinants of Health Using Few-shot Learning
This work addresses SDOH extraction for healthcare research, but its contribution is incremental: it applies an existing method (GPT-4 prompting) to a specific dataset.
The paper tackles the problem of automatically extracting social determinants of health (SDOH) from unstructured text in electronic health records using GPT-4 with one-shot prompting. It achieves an overall F1 score of 0.652 on the SHAC test set, comparable to the 7th best-performing system in the n2c2 challenge.
Social determinants of health (SDOH) documented in the electronic health record through unstructured text are increasingly being studied to understand how SDOH impact patient health outcomes. In this work, we utilize the Social History Annotation Corpus (SHAC), a multi-institutional corpus of de-identified social history sections annotated for SDOH, including substance use, employment, and living status information. We explore the automatic extraction of SDOH information with SHAC in both standoff and inline annotation formats using GPT-4 in a one-shot prompting setting. We compare GPT-4 extraction performance with a high-performing supervised approach and perform thorough error analyses. Our prompt-based GPT-4 method achieved an overall 0.652 F1 on the SHAC test set, similar to the 7th best-performing system among all teams in the n2c2 challenge with SHAC.
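The relationship between inline and standoff annotations, and the way an F1 score is derived from predicted spans, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `<Label>text</Label>` tag syntax, the prompt wording, the function names, and the exact-match scoring are hypothetical and are not taken from the paper or from the n2c2 challenge scorer, which may use different matching criteria. No model is called; the prompt is only assembled as a string.

```python
import re
from typing import List, Tuple

# A standoff span: (label, start, end) as character offsets into the raw text.
Span = Tuple[str, int, int]

# Hypothetical inline tag syntax, e.g. "<Alcohol>drinks</Alcohol>".
TAG = re.compile(r"<(?P<label>\w+)>(?P<text>.*?)</(?P=label)>")


def build_one_shot_prompt(example_text: str, example_annotated: str,
                          target_text: str) -> str:
    """Assemble a one-shot prompt: one worked example, then the target note."""
    return (
        "Annotate social determinants of health with inline tags.\n\n"
        f"Input: {example_text}\n"
        f"Output: {example_annotated}\n\n"
        f"Input: {target_text}\n"
        "Output:"
    )


def inline_to_standoff(annotated: str) -> Tuple[str, List[Span]]:
    """Strip inline tags and return the raw text plus standoff spans."""
    pieces: List[str] = []
    spans: List[Span] = []
    pos = 0     # read position in the annotated string
    offset = 0  # length of the raw text rebuilt so far
    for match in TAG.finditer(annotated):
        before = annotated[pos:match.start()]
        pieces.append(before)
        offset += len(before)
        text = match.group("text")
        spans.append((match.group("label"), offset, offset + len(text)))
        pieces.append(text)
        offset += len(text)
        pos = match.end()
    pieces.append(annotated[pos:])
    return "".join(pieces), spans


def f1_score(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match F1 over (label, start, end) triples (a simplification)."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


annotated = "Patient <Alcohol>drinks</Alcohol> and <Tobacco>smokes</Tobacco> daily."
raw_text, gold_spans = inline_to_standoff(annotated)
pred_spans = [("Alcohol", 8, 14), ("Employment", 0, 7)]
score = f1_score(gold_spans, pred_spans)
```

In this toy case one of the two predicted spans matches one of the two gold spans, so precision and recall are both 0.5 and the F1 is 0.5. Converting the model's inline output to standoff offsets before scoring lets both annotation formats be compared against the same gold annotations.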