CL · Jan 26, 2024

T-Rex: Text-assisted Retrosynthesis Prediction

arXiv:2401.14637v1 · 2 citations · Has Code
Originality: Incremental advance
AI Analysis

This work aims to improve retrosynthesis prediction by integrating text generated by language models, an incremental advance over existing template-free approaches.

The paper tackles retrosynthesis prediction in computational chemistry by proposing T-Rex, which uses ChatGPT to generate text descriptions of molecules and assist in identifying reactants, resulting in substantial performance improvements over graph-based state-of-the-art methods on two datasets.

As a fundamental task in computational chemistry, retrosynthesis prediction aims to identify a set of reactants to synthesize a target molecule. Existing template-free approaches only consider the graph structure of the target molecule, which often cannot generalize well to rare reaction types and large molecules. Here, we propose T-Rex, a text-assisted retrosynthesis prediction approach that exploits pre-trained text language models, such as ChatGPT, to assist the generation of reactants. T-Rex first exploits ChatGPT to generate a description of the target molecule and ranks candidate reaction centers based on both the description and the molecular graph. It then re-ranks these candidates by querying descriptions for each reactant and examining which group of reactants can best synthesize the target molecule. We observed that T-Rex substantially outperformed graph-based state-of-the-art approaches on two datasets, indicating the effectiveness of considering text information. We further found that T-Rex outperformed the variant that only uses the ChatGPT-based description without the re-ranking step, demonstrating that our framework outperforms a straightforward integration of ChatGPT and graph information. Collectively, we show that text generated by pre-trained language models can substantially improve retrosynthesis prediction, opening up new avenues for exploiting ChatGPT to advance computational chemistry. The code is available at https://github.com/lauyikfung/T-Rex.
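The two-stage procedure described in the abstract (rank candidate reaction centers, then re-rank using reactant descriptions) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `Candidate`, `rank_centers`, `rerank`, and `predict` are hypothetical names, the language-model call is replaced by a caller-supplied `describe` function, and the scoring functions `score_center` and `score_reactants` are placeholders for whatever models the real system trains.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container: a candidate reaction center and the reactants it implies."""
    center: str
    reactants: Tuple[str, ...]


def rank_centers(
    target: str,
    candidates: Sequence[Candidate],
    describe: Callable[[str], str],
    score_center: Callable[[str, str, Candidate], float],
    top_k: int,
) -> List[Candidate]:
    """Stage 1: score each candidate center using the target and its text description."""
    description = describe(target)  # stands in for a language-model query
    ranked = sorted(
        candidates,
        key=lambda c: score_center(target, description, c),
        reverse=True,
    )
    return ranked[:top_k]


def rerank(
    target: str,
    shortlist: Sequence[Candidate],
    describe: Callable[[str], str],
    score_reactants: Callable[[str, List[str]], float],
) -> List[Candidate]:
    """Stage 2: re-rank the shortlist using descriptions of each candidate's reactants."""
    def key(c: Candidate) -> float:
        reactant_descriptions = [describe(r) for r in c.reactants]
        return score_reactants(target, reactant_descriptions)

    return sorted(shortlist, key=key, reverse=True)


def predict(
    target: str,
    candidates: Sequence[Candidate],
    describe: Callable[[str], str],
    score_center: Callable[[str, str, Candidate], float],
    score_reactants: Callable[[str, List[str]], float],
    top_k: int = 3,
) -> List[Candidate]:
    """Run both stages and return the re-ranked shortlist, best candidate first."""
    shortlist = rank_centers(target, candidates, describe, score_center, top_k)
    return rerank(target, shortlist, describe, score_reactants)
```

Passing the describer and scorers as callables keeps the sketch independent of any particular language-model API or graph model. Dropping the `rerank` call gives the description-only variant that the abstract compares against.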

Code Implementations: 1 repo