CL
Jul 11, 2025

Application of CARE-SD text classifier tools to assess distribution of stigmatizing and doubt-marking language features in EHR

arXiv:2507.08969v1
h-index: 7
Originality: Synthesis-oriented
AI Analysis

This work addresses how patient stigmatization is perpetuated in healthcare settings, but its contribution is incremental: it applies existing tools to new data.

The study examines stigmatizing and doubt-marking language in electronic health records (EHR) by applying text classifier tools to MIMIC-III data. It finds higher rates of stigmatizing labels among Black or African American patients (RR: 1.16) and patients with government insurance (RR: 2.46), as well as in notes written by certain provider types such as social workers (RR: 2.25).

Introduction: Electronic health records (EHR) are a critical medium through which patient stigmatization is perpetuated among healthcare teams.

Methods: We identified linguistic features of doubt markers and stigmatizing labels in MIMIC-III EHR via expanded lexicon matching and supervised learning classifiers. Predictors of rates of linguistic features were assessed using Poisson regression models.

Results: We found higher rates of stigmatizing labels per chart among patients who were Black or African American (RR: 1.16), patients with Medicare/Medicaid or government-run insurance (RR: 2.46), self-pay (RR: 2.12), and patients with a variety of stigmatizing disease and mental health conditions. Patterns among doubt markers were similar, though male patients had higher rates of doubt markers (RR: 1.25). We found increased stigmatizing labels used by nurses (RR: 1.40) and social workers (RR: 2.25), with similar patterns of doubt markers.

Discussion: Stigmatizing language occurred at higher rates among historically stigmatized patients, perpetuated by multiple provider types.
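
The rate ratios (RR) reported above come from Poisson regression models. As a rough illustration of what an RR measures, the sketch below computes a crude (unadjusted) rate ratio between two groups, with a 95% Wald confidence interval on the log scale. For a Poisson model with a single binary predictor and a log-exposure offset, the exponentiated coefficient equals this crude ratio. The paper's models presumably adjust for several predictors at once, so this is not the authors' procedure. The function name `rate_ratio` and all counts are hypothetical and are not taken from the paper.

```python
import math


def rate_ratio(events_group, charts_group, events_ref, charts_ref, z=1.96):
    """Crude rate ratio of feature counts per chart, with a Wald interval.

    Hypothetical helper for illustration only: it compares one group with a
    reference group and ignores every other covariate.
    """
    rate_group = events_group / charts_group  # features per chart, group of interest
    rate_ref = events_ref / charts_ref        # features per chart, reference group
    rr = rate_group / rate_ref

    # Standard error of log(RR) when both counts are treated as Poisson.
    se_log_rr = math.sqrt(1 / events_group + 1 / events_ref)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Made-up counts: 300 stigmatizing labels in 100 charts versus 200 labels
# in 100 reference charts.
rr, lower, upper = rate_ratio(300, 100, 200, 100)
print(f"RR = {rr:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

An RR above 1 means the feature appears at a higher per-chart rate in the group of interest than in the reference group. An interval that excludes 1 suggests the difference is unlikely to be due to chance alone.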

