TadA-Bench: A Million-Variant Benchmark for Future-Round Discovery Toward Agentic Protein Engineering

Jin Gao, Juntu Zhao, Zirui Zeng, Jiaqi Shen, Junhao Shi, Dukun Zhao, Yuming Lu, Dequan Wang

arXiv:2606.02624

Predicted impact: top 26% in QM (last 90 days) · Originality: Incremental advance

AI Analysis

Provides a reproducible wet-lab replay benchmark for evaluating agentic protein engineering systems that must prioritize future experiments, addressing a key bottleneck in AI-driven directed evolution.

TadA-Bench is a million-variant benchmark, drawn from 31 directed-evolution rounds, that tests how well AI models rank variants appearing only in future rounds. Models that interpolate well under random splits perform much worse at future-round prediction, and the controlled analyses suggest that evolutionary coverage is more informative than local data density.

AI for scientific discovery is entering an agentic era, where protein-engineering systems are expected to prioritize future wet-lab experiments rather than merely fit static measurements. We introduce TadA-Bench, a million-variant wet-lab replay benchmark from 31 TadA directed-evolution rounds for future-round discovery toward agentic protein engineering. TadA-Bench preserves the campaign chronology and defines a fixed-data replay task: given earlier experimental rounds, models rank variants that appear only in later rounds. It provides aligned DNA, RNA, and protein views, and uses Seq2Graph, a graph-based label-unification pipeline, to reconcile noisy enrichment measurements into consistent cross-round activity labels. Random-split controls show strong interpolation, but future-round ranking and finite-budget candidate selection are much weaker. Controlled analyses suggest that evolutionary coverage is more informative than local data density, positioning TadA-Bench as a reproducible wet-lab replay substrate for future-round discovery toward agentic protein engineering; the data and code are released on Hugging Face and GitHub.
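
The replay task described in the abstract (train on earlier rounds, rank variants that appear only in later rounds, then select under a finite budget) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not the released TadA-Bench API: the `Variant` record, its `first_round` and `activity` fields, the `future_round_split` and `top_k_hit_rate` helpers, and the choice of hit rate as the budget metric.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Variant:
    # Hypothetical record layout; the released dataset may use other fields.
    sequence: str
    first_round: int  # round in which this measurement was taken
    activity: float   # unified cross-round activity label


def future_round_split(
    variants: Sequence[Variant], cutoff_round: int
) -> Tuple[List[Variant], List[Variant]]:
    """Chronological split: train on rounds <= cutoff, test on later rounds.

    Sequences already present in the training rounds are dropped from the
    test set, so the test set holds only variants that are new to the model.
    """
    train = [v for v in variants if v.first_round <= cutoff_round]
    seen = {v.sequence for v in train}
    test = [
        v
        for v in variants
        if v.first_round > cutoff_round and v.sequence not in seen
    ]
    return train, test


def top_k_hit_rate(
    test: Sequence[Variant],
    scores: Sequence[float],
    budget: int,
    threshold: float,
) -> float:
    """Fraction of the top-`budget` scored variants with activity >= threshold.

    This is one assumed way to score finite-budget selection; the paper may
    define its selection metric differently.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    if len(scores) != len(test):
        raise ValueError("scores and test must have the same length")
    ranked = sorted(zip(scores, test), key=lambda pair: pair[0], reverse=True)
    chosen = [variant for _, variant in ranked[:budget]]
    if not chosen:
        return 0.0
    hits = sum(1 for variant in chosen if variant.activity >= threshold)
    return hits / len(chosen)
```

A random-split control would instead shuffle all variants before partitioning, which lets near neighbors of each test variant leak into training. That leakage is why the two settings can diverge as sharply as the abstract reports.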
