CL · Nov 22, 2023

Surpassing GPT-4 Medical Coding with a Two-Stage Approach

Zhichao Yang, Sanjit Singh Batra, Joel Stremmel, Eran Halperin

arXiv:2311.13735v12.914 citations · h-index: 13

Originality: Incremental advance

AI Analysis

This work addresses the critical need for precise automated medical coding in healthcare and represents a strong domain-specific improvement over existing methods.

The paper tackles GPT-4's low precision in medical coding by introducing LLM-codex, a two-stage approach that first generates evidence proposals with an LLM and then verifies them with an LSTM. It reports state-of-the-art results in coding accuracy, rare-code accuracy, and evidence identification on the MIMIC dataset.

Recent advances in large language models (LLMs) show potential for clinical applications, such as clinical decision support and trial recommendations. However, the GPT-4 LLM predicts an excessive number of ICD codes for medical coding tasks, leading to high recall but low precision. To tackle this challenge, we introduce LLM-codex, a two-stage approach to predict ICD codes that first generates evidence proposals using an LLM and then employs an LSTM-based verification stage. The LSTM learns from both the LLM's high recall and human experts' high precision, using a custom loss function. According to experiments on the MIMIC dataset, our model is the only approach that simultaneously achieves state-of-the-art results in medical coding accuracy, accuracy on rare codes, and sentence-level evidence identification to support coding decisions, without training on human-annotated evidence.
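
The precision/recall trade-off described in the abstract can be illustrated with a small toy sketch. This is not the paper's model or loss function: the code labels, the `verifier_scores` values, the `verify` helper, and the 0.5 threshold are all hypothetical stand-ins, chosen only to show why over-proposing codes gives high recall but low precision, and how a second verification stage could filter the proposals.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of predicted codes against gold codes."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def verify(proposals, scores, threshold=0.5):
    """Keep only proposals whose (hypothetical) verifier score meets the threshold."""
    return {code for code in proposals if scores.get(code, 0.0) >= threshold}


# Toy data, invented for illustration only.
gold = {"401.9", "250.00"}

# Stage 1 stand-in: an over-generating proposer that finds every gold code
# but also proposes several codes that are not in the gold set.
proposed = {"401.9", "250.00", "486", "584.9", "530.81", "285.9"}

# Stage 2 stand-in: made-up scores playing the role of a learned verifier.
verifier_scores = {
    "401.9": 0.92, "250.00": 0.81, "486": 0.30,
    "584.9": 0.12, "530.81": 0.45, "285.9": 0.08,
}

before = precision_recall(proposed, gold)
kept = verify(proposed, verifier_scores)
after = precision_recall(kept, gold)

print(f"proposals only:     precision={before[0]:.2f}, recall={before[1]:.2f}")
print(f"after verification: precision={after[0]:.2f}, recall={after[1]:.2f}")
```

In this toy case, the six proposals cover both gold codes (recall 1.0) but only two of six are correct (precision about 0.33). Filtering with the assumed scores keeps just the two gold codes. A real verifier would be learned from data and would not separate the codes this cleanly.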
