LGCLCY · Dec 14, 2025

Social Determinants of Health Prediction for ICD-9 Code with Reasoning Models

arXiv:2601.09709v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of extracting social determinants of health from clinical text for healthcare professionals. It is incremental, however, because it applies existing methods to a specific dataset.

The paper tackles the problem of predicting Social Determinants of Health (SDoH) ICD-9 codes for hospital admissions, achieving an 89% F1 score on the MIMIC-III dataset.

Social Determinants of Health (SDoH) correlate with patient outcomes but are rarely captured in structured data. Recent work has focused on automatically extracting these markers from clinical text to supplement diagnostic systems with knowledge of patients' social circumstances. Large language models demonstrate strong performance in identifying SDoH labels from individual sentences. However, prediction over long admission records or longitudinal notes is challenging because of long-distance dependencies. In this paper, we explore multi-label SDoH ICD-9 code classification for hospital admissions on the MIMIC-III dataset using reasoning models and traditional large language models. We exploit existing ICD-9 codes for prediction on admissions, achieving an 89% F1 score. Our contributions include our findings, missing SDoH codes in 139 admissions, and code to reproduce the results.
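To make the evaluation setting concrete, the sketch below shows one common way to score multi-label code prediction per admission: micro-averaged F1 over sets of codes. This is an illustration under assumptions, not the paper's method. The abstract does not say which F1 averaging was used, and the function name `micro_f1`, the admission IDs, and the two example codes are hypothetical placeholders chosen only for the demonstration.

```python
from typing import Dict, Set


def micro_f1(gold: Dict[str, Set[str]], pred: Dict[str, Set[str]]) -> float:
    """Micro-averaged F1 over per-admission code sets.

    Counts true positives, false positives, and false negatives across
    all admissions, then computes a single precision/recall/F1.
    (Assumption: micro averaging; the source does not specify.)
    """
    tp = fp = fn = 0
    # Cover admissions that appear in either mapping.
    for adm_id in gold.keys() | pred.keys():
        gold_codes = gold.get(adm_id, set())
        pred_codes = pred.get(adm_id, set())
        tp += len(gold_codes & pred_codes)   # codes predicted and correct
        fp += len(pred_codes - gold_codes)   # codes predicted but wrong
        fn += len(gold_codes - pred_codes)   # codes missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: admission IDs and codes are placeholders.
gold = {
    "adm1": {"V60.0", "V62.0"},
    "adm2": {"V62.0"},
    "adm3": set(),              # admission with no SDoH code
}
pred = {
    "adm1": {"V60.0"},          # misses V62.0
    "adm2": {"V62.0", "V60.0"}, # one spurious code
    "adm3": set(),
}

score = micro_f1(gold, pred)
print(f"micro-F1 = {score:.3f}")
```

In this toy case there are 2 true positives, 1 false positive, and 1 false negative, so precision and recall are both 2/3. Micro averaging weights frequent codes more heavily than rare ones; a macro average over codes would treat each code equally and could give a different number on imbalanced data.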
