Intelligent Documentation in Medical Education: Can AI Replace Manual Case Logging?
The work addresses the clerical burden on medical trainees, though its contribution is incremental: it applies existing AI methods to a new domain-specific task.
This study tackles time-consuming and inconsistent manual case logging in radiology training by evaluating large language models (LLMs) that automate documentation from free-text reports, with best F1-scores approaching 0.87 on 414 interventional radiology reports.
Procedural case logs are a core requirement in radiology training, yet they are time-consuming to complete and prone to inconsistency when authored manually. This study investigates whether large language models (LLMs) can automate procedural case log documentation directly from free-text radiology reports. We evaluate multiple local and commercial LLMs under instruction-based and chain-of-thought prompting to extract structured procedural information from 414 curated interventional radiology reports authored by nine residents between 2018 and 2024. Model performance is assessed using sensitivity, specificity, and F1-score, alongside inference latency and token efficiency to estimate operational cost. Results show that both local and commercial models achieve strong extraction performance, with best F1-scores approaching 0.87, while exhibiting different trade-offs between speed and cost. Automation using LLMs has the potential to substantially reduce clerical burden for trainees and improve consistency in case logging. These findings demonstrate the feasibility of AI-assisted documentation in medical education and highlight the need for further validation across institutions and clinical workflows.
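To make the evaluation metrics concrete, the following is a minimal sketch of how sensitivity, specificity, and F1-score could be computed for this kind of extraction task. It assumes each report is reduced to a set of procedure labels and that predictions are scored per report and per label, with counts pooled across all reports. The abstract does not state the study's actual label set or aggregation scheme, so the function names, the `Counts` type, and the pooling strategy are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence, Set


@dataclass
class Counts:
    """Pooled confusion counts over every (report, label) pair."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0


def count_outcomes(
    predicted: Sequence[Set[str]],
    reference: Sequence[Set[str]],
    labels: Iterable[str],
) -> Counts:
    """Compare model-extracted labels with reference labels, report by report.

    Assumption: each report is scored once per label in `labels`.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    label_list = list(labels)
    counts = Counts()
    for pred, ref in zip(predicted, reference):
        for label in label_list:
            in_pred, in_ref = label in pred, label in ref
            if in_pred and in_ref:
                counts.tp += 1
            elif in_pred:
                counts.fp += 1
            elif in_ref:
                counts.fn += 1
            else:
                counts.tn += 1
    return counts


def _ratio(numerator: int, denominator: int) -> float:
    # Convention chosen for this sketch: an undefined ratio is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def sensitivity(c: Counts) -> float:
    """True positive rate: TP / (TP + FN)."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Counts) -> float:
    """True negative rate: TN / (TN + FP)."""
    return _ratio(c.tn, c.tn + c.fp)


def f1_score(c: Counts) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    return _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn)
```

Pooling counts before computing the ratios gives micro-averaged scores. Averaging per-label scores instead (macro-averaging) would weight rare procedures more heavily and could yield different numbers on the same predictions.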