PlotPick: AI-powered batch extraction of numerical data from scientific figures
For researchers conducting systematic reviews and meta-analyses, PlotPick provides a scalable alternative to manual digitisation of figure data.
PlotPick uses vision-language models to batch-extract numerical data from scientific figures, reaching 88-96% recall on ChartX (vs. 71% for the dedicated model DePlot) and 86-99% RMSF1 on PlotQA (vs. 94%).
Systematic reviews and meta-analyses frequently require numerical data that authors report only as figures, yet manual digitisation is slow and does not scale. We present PlotPick, an open-source tool that uses vision-language models (VLMs) to batch-extract structured tabular data from scientific figures. We evaluate six VLMs from three providers on two established chart-to-table benchmarks (ChartX and PlotQA) and compare against the dedicated chart-to-table model DePlot. All six VLMs outperform DePlot on both benchmarks. On ChartX (restricted to bar charts, line charts, box plots, and histograms; n=300), VLMs achieve 88-96% recall versus 71% for DePlot. On PlotQA (n=529), VLMs achieve 86-99% RMSF1 versus 94% for DePlot. The gap is largest on chart types absent from the dedicated model's training data: on box plots, DePlot achieves 24% RMSF1 while VLMs achieve 83-97%. PlotPick is available at https://plotpick.streamlit.app.