AI-assisted Protocol Information Extraction For Improved Accuracy and Efficiency in Clinical Trial Workflows

Ramtin Babaeipour, François Charest, Madison Wright

arXiv:2602.0005225.3h-index: 4

AI Analysis

For clinical research coordinators and trial teams, this work addresses the burden of manual protocol extraction by demonstrating that AI assistance can improve accuracy and efficiency, though expert oversight remains necessary.

The authors developed an AI system using generative LLMs with Retrieval-Augmented Generation (RAG) for automated clinical trial protocol information extraction, achieving 89.0% accuracy compared to 62.6% for standalone LLMs, and reducing task completion time by 40% with lower cognitive demand and strong user preference.

Increasing clinical trial protocol complexity, amendments, and challenges around knowledge management create significant burden for trial teams. Structuring protocol content into standard formats has the potential to improve efficiency, support documentation quality, and strengthen compliance. We evaluate an Artificial Intelligence (AI) system using generative LLMs with Retrieval-Augmented Generation (RAG) for automated clinical trial protocol information extraction. We compare the extraction accuracy of our clinical-trial-specific RAG process against that of publicly available (standalone) LLMs. We also assess the operational impact of AI-assistance on simulated extraction Clinical Research Coordinator (CRC) workflows. Our RAG process shows higher extraction accuracy (89.0%) than standalone LLMs with fine-tuned prompts (62.6%) against expert-supported reference annotations. In simulated extraction workflows, AI-assisted tasks are completed 40% faster, are rated as less cognitively demanding and are strongly preferred by users. While expert oversight remains essential, this suggests that AI-assisted extraction can enable protocol intelligence at scale, motivating the integration of similar methodologies into real-world clinical workflows to further validate its impact on feasibility, study start-up, and post-activation monitoring.

View on arXiv PDF

Similar