MolRecBench-Wild: A Real-World Benchmark for Optical Chemical Structure Recognition
For researchers developing OCSR systems, this benchmark provides a more realistic evaluation that highlights the limitations of current models on real-world chemical diagrams.
The paper introduces MOSAIC, a dual-dimensional difficulty framework for OCSR, and constructs MolRecBench-Wild, a benchmark of 5,029 structures from 820 recent chemistry papers. Experiments on 18 models show severe performance degradation relative to results on earlier patent-based benchmarks, exposing a large gap between real-world academic scenarios and existing evaluations.
Optical Chemical Structure Recognition (OCSR) aims to translate molecular diagrams in scientific literature into machine-readable formats, but current systems remain unreliable on real-world images due to substantial visual and chemical complexity. We introduce MOSAIC, a dual-dimensional difficulty framework with 37 fine-grained labels that jointly characterize visual interference and chemical semantic challenges in molecular diagrams. Based on this framework, we construct MolRecBench-Wild, a benchmark of 5,029 structures from 820 recent chemistry papers, covering the full difficulty spectrum observed in real publications. To enable faithful semantic evaluation beyond what SMILES and MolFile can express, we propose CARBON, a representation language capable of expressing valence variations, icon-based groups, and other non-standard chemical semantics. We further adopt a dual-track evaluation protocol supporting both CARBON and SMILES outputs for broad model compatibility. Comprehensive experiments over 18 OCSR-capable models reveal severe performance degradation on MolRecBench-Wild, exposing a large gap between earlier patent-based benchmarks and real-world academic scenarios.
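As a rough illustration of what a dual-track protocol could look like in practice, the sketch below scores each prediction within its own output track and reports accuracy per track. It is not the paper's evaluation code. The `Sample` record, the track names `"carbon"` and `"smiles"`, and the `is_match` helper are all hypothetical, and the strings used for the CARBON track are opaque placeholders rather than real CARBON syntax. Matching is plain string equality, which stands in for whatever chemical equivalence check a real evaluator would use.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One benchmark item (hypothetical schema, not from the paper)."""
    sample_id: str
    track: str        # assumed track names: "carbon" or "smiles"
    prediction: str   # model output in that track's format
    reference: str    # ground truth in the same format


def is_match(prediction: str, reference: str) -> bool:
    # Placeholder comparison: exact match after trimming whitespace.
    # A real evaluator would compare chemical meaning, not raw strings.
    return prediction.strip() == reference.strip()


def accuracy_by_track(samples):
    """Return {track: fraction of samples whose prediction matches}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for s in samples:
        totals[s.track] += 1
        hits[s.track] += int(is_match(s.prediction, s.reference))
    return {track: hits[track] / totals[track] for track in totals}


samples = [
    Sample("a", "smiles", "CCO", "CCO"),
    # Same molecule written two ways; string equality counts this as a miss.
    Sample("b", "smiles", "c1ccccc1", "C1=CC=CC=C1"),
    # Opaque placeholder strings, NOT real CARBON syntax.
    Sample("c", "carbon", "<placeholder-1>", "<placeholder-1>"),
]

scores = accuracy_by_track(samples)
```

The second sample shows why string equality is only a stand-in: both strings describe benzene, so an evaluator that canonicalizes structures before comparing would count it as correct.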