LG QM Jan 30, 2024

ReacLLaMA: Merging chemical and textual information in chemical reactivity AI models

Aline Hartgers, Ramil Nugmanov, Kostiantyn Chernichenko, Joerg Kurt Wegner

arXiv:2401.17267v1, 2.61 citations, h-index: 3

Originality: Incremental advance

AI Analysis

This work addresses the need for more accurate chemical reactivity models by incorporating textual protocol details, though it is incremental as it builds on existing methods.

The authors tackled the problem of predicting chemical reaction outcomes by augmenting a Graphormer reactivity model with procedural text, resulting in improved specificity for identifying unpromising reactions.

Chemical reactivity models are developed to predict chemical reaction outcomes in the form of classification (success/failure) or regression (product yield) tasks. The vast majority of the reported models are trained solely on chemical information such as reactants, products, reagents, and solvents, but not on the details of a synthetic protocol. Herein, procedural text is incorporated to augment the Graphormer reactivity model and improve its accuracy. Two major approaches are used: training an adapter Graphormer model that is provided with a GPT-2-derived latent representation of the text procedure (ReacLLaMA-Adapter), and labeling an unlabeled part of a dataset with the LLaMA 2 model, then training the Graphormer on the extended dataset (Zero-Shot Labeling ReacLLaMA). Both methodologies enhance the discernment of unpromising reactions, thereby providing more accurate models with improved specificity.
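Because the abstract reports its gains in terms of specificity, a small sketch of that metric for a success/failure classifier may help. It assumes the usual definition, the fraction of truly failed reactions that the model also predicts as failed, with labels encoded as 1 for success and 0 for failure. The function name and the example labels are hypothetical and are not taken from the paper.

```python
def specificity(y_true, y_pred):
    """Return TN / (TN + FP), treating label 0 (failed reaction) as the negative class."""
    pairs = list(zip(y_true, y_pred))
    # True negatives: failed reactions correctly predicted as failed.
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # False positives: failed reactions wrongly predicted as successful.
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tn + fp == 0:
        raise ValueError("specificity is undefined without negative examples")
    return tn / (tn + fp)


# Hypothetical labels for six reactions (1 = success, 0 = failure).
y_true = [1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0]

spec = specificity(y_true, y_pred)
print(f"specificity = {spec:.2f}")  # prints "specificity = 0.75"
```

Here three of the four failed reactions are flagged correctly, so a model that better discerns unpromising reactions would push this value toward 1.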
