AutoDiscover: A reinforcement learning recommendation system for the cold-start imbalance challenge in active learning, powered by graph-aware Thompson sampling

arXiv:2602.05087v1 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the manual-screening bottleneck in evidence-based research by improving active learning efficiency when relevant studies are rare and expert labels are scarce. It is an incremental advance: it builds on existing methods and adds adaptive enhancements.

The paper tackles the cold-start imbalance challenge in active learning for systematic literature reviews. It introduces AutoDiscover, a reinforcement learning recommendation system that uses graph-aware Thompson sampling to manage a portfolio of query strategies dynamically. On the SYNERGY benchmark, it achieves higher screening efficiency than static active learning baselines and mitigates cold start by bootstrapping from minimal initial labels.

Systematic literature reviews (SLRs) are fundamental to evidence-based research, but manual screening is an increasing bottleneck as scientific output grows. Screening features low prevalence of relevant studies and scarce, costly expert decisions. Traditional active learning (AL) systems help, yet typically rely on fixed query strategies for selecting the next unlabeled documents. These static strategies do not adapt over time and ignore the relational structure of scientific literature networks. This thesis introduces AutoDiscover, a framework that reframes AL as an online decision-making problem driven by an adaptive agent. Literature is modeled as a heterogeneous graph capturing relationships among documents, authors, and metadata. A Heterogeneous Graph Attention Network (HAN) learns node representations, which a Discounted Thompson Sampling (DTS) agent uses to dynamically manage a portfolio of query strategies. With real-time human-in-the-loop labels, the agent balances exploration and exploitation under non-stationary review dynamics, where strategy utility changes over time. On the 26-dataset SYNERGY benchmark, AutoDiscover achieves higher screening efficiency than static AL baselines. Crucially, the agent mitigates cold start by bootstrapping discovery from minimal initial labels where static approaches fail. We also introduce TS-Insight, an open-source visual analytics dashboard to interpret, verify, and diagnose the agent's decisions. Together, these contributions accelerate SLR screening under scarce expert labels and low prevalence of relevant studies.
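To make the idea of a Discounted Thompson Sampling agent choosing among query strategies concrete, here is a minimal sketch. It is not the authors' implementation: it assumes binary rewards (1 if the queried document was labeled relevant, 0 otherwise), a Beta(1, 1) prior per strategy, and a single discount factor `gamma` applied to all strategies at every step. The class name, the `simulate` helper, and the hit rates are hypothetical and chosen only for illustration, and the graph (HAN) component is left out entirely.

```python
import random


class DiscountedThompsonSampler:
    """Illustrative Beta-Bernoulli Thompson sampler with discounting.

    Each arm stands for one query strategy (hypothetical setup).
    Old evidence is multiplied by gamma on every update, so the agent
    can track strategies whose usefulness drifts over time.
    """

    def __init__(self, n_arms, gamma=0.95, seed=0):
        if not 0.0 < gamma <= 1.0:
            raise ValueError("gamma must be in (0, 1]")
        self.gamma = gamma
        self.successes = [0.0] * n_arms  # discounted count of relevant hits
        self.failures = [0.0] * n_arms   # discounted count of misses
        self.rng = random.Random(seed)

    def select(self):
        """Sample one value per arm from its Beta posterior; pick the max."""
        samples = [
            self.rng.betavariate(1.0 + s, 1.0 + f)  # Beta(1, 1) prior assumed
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        """Discount all arms, then add the new observation (reward is 0 or 1)."""
        self.successes = [self.gamma * s for s in self.successes]
        self.failures = [self.gamma * f for f in self.failures]
        self.successes[arm] += reward
        self.failures[arm] += 1 - reward

    def posterior_mean(self, arm):
        """Mean of the arm's current Beta posterior."""
        a = 1.0 + self.successes[arm]
        b = 1.0 + self.failures[arm]
        return a / (a + b)


def simulate(steps=400, gamma=0.95, seed=0):
    """Toy non-stationary run: the better strategy swaps at the midpoint.

    Returns how often each arm was chosen in the second half.
    The hit rates below are made up for illustration.
    """
    env_rng = random.Random(seed + 1)
    agent = DiscountedThompsonSampler(n_arms=2, gamma=gamma, seed=seed)
    picks_late = [0, 0]
    for t in range(steps):
        hit_rates = (0.6, 0.1) if t < steps // 2 else (0.1, 0.6)
        arm = agent.select()
        reward = 1 if env_rng.random() < hit_rates[arm] else 0
        agent.update(arm, reward)
        if t >= steps // 2:
            picks_late[arm] += 1
    return picks_late
```

Calling `simulate()` runs the toy environment and returns per-strategy pick counts for the second half of the run. With `gamma=1.0` the sampler reduces to ordinary Thompson sampling, which gives a simple baseline for seeing what discounting changes when strategy utility shifts.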
