AI · May 27

MACReD: A Multi-Agent Collaborative Reasoning Framework for Reaction Diagram Parsing

Chuang Tang, Chenhao Lin, Yin Xu, Hao Wang, Jinrui Zhou, Xin Li, Mingjun Xiao, Enhong Chen

arXiv:2605.28077 · 59.8 · h-index: 4

AI Analysis

For researchers in chemistry and document analysis, this work provides a robust method to automatically extract reaction information from complex diagrams, addressing a key bottleneck in chemical literature mining.

MACReD introduces a multi-agent framework for parsing chemical reaction diagrams. On the RxnScribe benchmark it reaches state-of-the-art F1 scores of 75.2% (hard match) and 84.6% (soft match), outperforming the RxnScribe baseline (69.1% and 80.0%) by 6.1 and 4.6 percentage points, respectively.

Parsing chemical reaction diagrams from scientific literature is challenging due to heterogeneous layouts, intertwined visual elements, and the difficulty of integrating recognition and reasoning. Existing vision-language models advance multimodal understanding but still fail on complex diagrams, struggling to maintain spatial coherence and to integrate multidimensional information during reasoning. To address these issues, we propose MACReD, a hierarchical multi-agent framework that coordinates specialized agents for molecular perception, arrow understanding, text extraction, and reaction reconstruction within a unified VLM-guided architecture. The planning and perception layers use flexible, fine-grained detection to handle visual complexity, while the reasoning layer uses a multigraph fusion mechanism to integrate heterogeneous cues and enforce chemically consistent global reasoning. Experiments on the RxnScribe benchmark show that MACReD achieves state-of-the-art performance, with F1 scores of 75.2% and 84.6% under hard and soft match criteria, outperforming the RxnScribe baseline, which obtains 69.1% and 80.0%, respectively. These results demonstrate the robustness of MACReD across diverse diagram layouts, including multi-step and tree-structured reactions.
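To make the hard and soft F1 figures concrete, the following is a minimal sketch of how a reaction-level F1 score could be computed under a strict and a relaxed match criterion. It is not the benchmark's evaluation code. The `Reaction` class, the string entity labels, and both match functions are hypothetical. In particular, treating "soft" as "ignore conditions" is an assumption made for illustration; the official criteria are defined by the RxnScribe benchmark.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Reaction:
    """Hypothetical reaction record; entities are plain string labels here."""
    reactants: FrozenSet[str]
    conditions: FrozenSet[str]
    products: FrozenSet[str]


def hard_match(pred: Reaction, gold: Reaction) -> bool:
    # Strict criterion (assumed): every role must agree exactly.
    return pred == gold


def soft_match(pred: Reaction, gold: Reaction) -> bool:
    # Relaxed criterion (assumed for illustration): conditions are ignored.
    return pred.reactants == gold.reactants and pred.products == gold.products


def f1_score(
    preds: List[Reaction],
    golds: List[Reaction],
    match: Callable[[Reaction, Reaction], bool],
) -> float:
    """Greedy one-to-one matching of predictions to ground truth, then F1."""
    unmatched = list(golds)
    true_positives = 0
    for pred in preds:
        for gold in unmatched:
            if match(pred, gold):
                unmatched.remove(gold)  # each gold reaction is used at most once
                true_positives += 1
                break
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(golds) if golds else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy two-step diagram: A + B -> C (heat), then C -> D (H2).
golds = [
    Reaction(frozenset({"A", "B"}), frozenset({"heat"}), frozenset({"C"})),
    Reaction(frozenset({"C"}), frozenset({"H2"}), frozenset({"D"})),
]
# The second prediction has the right molecules but the wrong condition.
preds = [
    Reaction(frozenset({"A", "B"}), frozenset({"heat"}), frozenset({"C"})),
    Reaction(frozenset({"C"}), frozenset({"Pd"}), frozenset({"D"})),
]

hard_f1 = f1_score(preds, golds, hard_match)
soft_f1 = f1_score(preds, golds, soft_match)
print(f"hard F1 = {hard_f1:.2f}, soft F1 = {soft_f1:.2f}")
```

In this toy case the misread condition costs one match under the strict criterion but not under the relaxed one, which is why soft scores are generally at least as high as hard scores, as in the reported 84.6% versus 75.2%.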
