CL CR IR LG | Mar 21, 2025

Can Zero-Shot Commercial APIs Deliver Regulatory-Grade Clinical Text DeIdentification?

Veysel Kocaman, Muhammed Santas, Yigit Gul, Mehmet Butgul, David Talby

arXiv:2503.20794v2 | 4.94 citations | h-index: 14

Originality: Synthesis-oriented

AI Analysis

This paper addresses the need for cost-effective, accurate de-identification of medical text to comply with regulations such as HIPAA. Its contribution is incremental: it compares existing solutions rather than introducing new methods.

The study evaluated four commercial solutions for de-identifying clinical text. John Snow Labs achieved the highest accuracy, with a 96% F1-score in PHI detection, which the authors describe as regulatory-grade, and was over 80% cheaper than Azure and GPT-4o.

We evaluate the performance of four leading solutions for de-identification of unstructured medical text (Azure Health Data Services, AWS Comprehend Medical, OpenAI GPT-4o, and John Snow Labs) on a ground truth dataset of 48 clinical documents annotated by medical experts. The analysis, conducted at both entity level and token level, suggests that John Snow Labs' Medical Language Models solution achieves the highest accuracy, with a 96% F1-score in protected health information (PHI) detection, outperforming Azure (91%), AWS (83%), and GPT-4o (79%). John Snow Labs is the only solution that achieves regulatory-grade accuracy (surpassing that of human experts), and it is also the most cost-effective: it is over 80% cheaper than Azure and GPT-4o, and it is the only solution not priced by token. Its fixed-cost local deployment model avoids the escalating per-request fees of cloud-based services, making it a scalable and economical choice.
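The abstract reports F1-scores at both entity level and token level without spelling out how the two differ. The sketch below illustrates one common way to compute them. It is not the authors' evaluation code: the span representation, the exact-match rule for entities, the function names, and the toy annotations are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def entity_level_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Assumed rule: a predicted entity counts only if boundaries and label match exactly."""
    tp = len(gold & pred)
    return f1(tp, len(pred - gold), len(gold - pred))


def token_level_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Expand each span into (token_index, label) pairs and score token by token."""
    gold_tokens = {(i, label) for start, end, label in gold for i in range(start, end)}
    pred_tokens = {(i, label) for start, end, label in pred for i in range(start, end)}
    tp = len(gold_tokens & pred_tokens)
    return f1(tp, len(pred_tokens - gold_tokens), len(gold_tokens - pred_tokens))


# Toy annotations (invented): the system finds the date but only half of a two-token name.
gold = {(0, 2, "NAME"), (5, 6, "DATE")}
pred = {(0, 1, "NAME"), (5, 6, "DATE")}

entity_score = entity_level_f1(gold, pred)  # 0.5: the partial name counts as a miss and a false alarm
token_score = token_level_f1(gold, pred)    # 0.8: the partial name still earns credit for one token
```

Under these assumptions, token-level scoring rewards partial matches while entity-level scoring does not, which is one reason an evaluation may report both.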
