CL | Nov 4, 2025

ROBoto2: An Interactive System and Dataset for LLM-assisted Clinical Trial Risk of Bias Assessment

arXiv:2511.03048v1 | 1 citation | h-index: 2 | Has Code | EMNLP
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of automating systematic reviews for researchers and clinicians. It is incremental: it builds on existing methods and contributes a new dataset and interface.

The authors tackle the labor-intensive process of risk of bias assessment in clinical trials by developing ROBOTO2, an interactive system that uses LLMs to assist with annotation. The work yields a dataset of 521 trials and a benchmark that shows current model capabilities and remaining challenges.

We present ROBOTO2, an open-source, web-based platform for large language model (LLM)-assisted risk of bias (ROB) assessment of clinical trials. ROBOTO2 streamlines the traditionally labor-intensive ROB v2 (ROB2) annotation process via an interactive interface that combines PDF parsing, retrieval-augmented LLM prompting, and human-in-the-loop review. Users can upload clinical trial reports, receive preliminary answers and supporting evidence for ROB2 signaling questions, and provide real-time feedback or corrections to system suggestions. ROBOTO2 is publicly available at https://roboto2.vercel.app/, with code and data released to foster reproducibility and adoption. We construct and release a dataset of 521 pediatric clinical trial reports (8954 signaling questions with 1202 evidence passages), annotated using both manual and LLM-assisted methods, serving as a benchmark and enabling future research. Using this dataset, we benchmark ROB2 performance for 4 LLMs and provide an analysis of current model capabilities and ongoing challenges in automating this critical aspect of systematic review.
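The retrieval-augmented prompting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the ROBOTO2 implementation: the token-overlap ranking, the stopword list, the function names (`retrieve`, `build_prompt`), and the sample passages are all assumptions made for illustration, and a real system would likely use a stronger retriever. The sketch only shows the general shape of the idea: pick the passages most relevant to a ROB2 signaling question, then place them in a prompt that asks for one of the standard ROB2 response options.

```python
import re
from collections import Counter

# Hypothetical stopword list; only here to keep function words from dominating the score.
STOPWORDS = {"a", "an", "the", "was", "were", "is", "are", "of", "at", "to", "in"}

def tokenize(text):
    """Lowercase, split into alphanumeric tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]

def overlap_score(question, passage):
    """Count tokens shared by the question and the passage (with multiplicity)."""
    q = Counter(tokenize(question))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)

def retrieve(question, passages, k=2):
    """Return the k passages with the highest overlap score (stable for ties)."""
    ranked = sorted(passages, key=lambda p: overlap_score(question, p), reverse=True)
    return ranked[:k]

def build_prompt(question, evidence):
    """Assemble a prompt that pairs the signaling question with retrieved evidence."""
    lines = [
        "Answer the ROB2 signaling question using only the evidence below.",
        f"Question: {question}",
        "Evidence:",
    ]
    lines += [f"[{i}] {p}" for i, p in enumerate(evidence, 1)]
    lines.append("Answer (Yes / Probably yes / Probably no / No / No information):")
    return "\n".join(lines)

# Illustrative passages, invented for this example (not taken from the dataset).
PASSAGES = [
    "Participants were assigned using a computer-generated random allocation sequence.",
    "The primary outcome was weight gain at 12 weeks.",
    "Allocation was concealed using sealed opaque envelopes.",
]
QUESTION = "Was the allocation sequence random?"

if __name__ == "__main__":
    print(build_prompt(QUESTION, retrieve(QUESTION, PASSAGES)))
```

In a human-in-the-loop setting, the retrieved passages would be shown next to the model's preliminary answer so that a reviewer can accept or correct it.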
