IR, LG | Mar 6

OpenExtract: Automated Data Extraction for Systematic Reviews in Health

arXiv:2603.13338 | h-index: 6 | Has Code
AI Analysis

OpenExtract addresses the efficiency challenge facing researchers who conduct systematic reviews, though the contribution appears incremental because it applies existing LLM methods to a specific domain.

The study automates data extraction for systematic reviews in health with OpenExtract, an open-source pipeline built on LLMs. Compared against human researchers on a digital health review, it achieved precision and recall above 0.8.

This study presents OpenExtract, an open-source pipeline for automated data extraction in large-scale systematic literature reviews. The pipeline queries large language models (LLMs) to predict data entries based on relevant sections of scientific articles. To test the efficacy of OpenExtract, we apply it to a systematic literature review in digital health and compare its outputs with those of human researchers. OpenExtract achieves precision and recall scores of > 0.8 in this task, indicating that it can be effective at extracting data automatically and efficiently. OpenExtract: https://github.com/JimAchterbergLUMC/OpenExtract.
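The comparison against human researchers can be illustrated with a small sketch of how precision and recall might be computed over extracted data entries. This is not OpenExtract's evaluation code: the function name `precision_recall`, the `(article_id, field)` keying, the exact-match criterion, and the toy data are all assumptions made for illustration.

```python
def precision_recall(predicted, reference):
    """Exact-match precision and recall over {(article_id, field): value} entries.

    Assumption: an entry is correct only if the predicted value equals the
    human-annotated value exactly; None means "no value extracted".
    """
    pred = {k: v for k, v in predicted.items() if v is not None}
    ref = {k: v for k, v in reference.items() if v is not None}

    # True positives: predicted entries whose value matches the human entry.
    true_pos = sum(1 for k, v in pred.items() if ref.get(k) == v)

    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical toy data: two articles, two fields each.
human = {
    ("a1", "sample_size"): "120",
    ("a1", "country"): "NL",
    ("a2", "sample_size"): "45",
    ("a2", "country"): "US",
}
model = {
    ("a1", "sample_size"): "120",
    ("a1", "country"): "NL",
    ("a2", "sample_size"): "54",   # wrong value
    ("a2", "country"): None,       # nothing extracted
}

p, r = precision_recall(model, human)
print(f"precision={p:.2f} recall={r:.2f}")
```

Exact matching is the strictest choice; free-text fields would likely need a more lenient comparison, which this sketch does not attempt.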

