CLAIIR
Dec 8, 2023

Predictive Chemistry Augmented with Text Retrieval

arXiv:2312.04881v1 | 136 citations | h-index: 98 | EMNLP
Originality: Incremental advance
AI Analysis

TextReact addresses the bottleneck of manually extracting structured data from the literature in chemoinformatics, a domain-specific advance for chemistry researchers.

The paper introduces TextReact, a method that retrieves natural language descriptions from the literature and aligns them with the molecular representation of a reaction. It reports significant improvements over state-of-the-art models on two tasks: reaction condition recommendation and one-step retrosynthesis.

This paper focuses on using natural language descriptions to enhance predictive models in the chemistry field. Conventionally, chemoinformatics models are trained with extensive structured data manually extracted from the literature. In this paper, we introduce TextReact, a novel method that directly augments predictive chemistry with texts retrieved from the literature. TextReact retrieves text descriptions relevant for a given chemical reaction, and then aligns them with the molecular representation of the reaction. This alignment is enhanced via an auxiliary masked LM objective incorporated in the predictor training. We empirically validate the framework on two chemistry tasks: reaction condition recommendation and one-step retrosynthesis. By leveraging text retrieval, TextReact significantly outperforms state-of-the-art chemoinformatics models trained solely on molecular data.
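The retrieve-then-align idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the two-dimensional embeddings, the three passages, the `[SEP]`-joined input format, the masking rate, and the helper names `top_k`, `build_input`, and `mask_text_tokens` are all hypothetical. The sketch scores passages by dot product, prepends the reaction SMILES to the retrieved text, and masks some text tokens the way a masked LM auxiliary objective would.

```python
import random

MASK = "[MASK]"

def top_k(query_vec, doc_vecs, k):
    """Return indices of the k passages with the highest dot-product score."""
    scores = [sum(q * d for q, d in zip(query_vec, vec)) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]

def build_input(reaction_smiles, passages):
    """Join the reaction and its retrieved passages into one predictor input."""
    return " [SEP] ".join([reaction_smiles] + passages)

def mask_text_tokens(tokens, text_start, rate, rng):
    """Mask text tokens at positions >= text_start; labels keep the originals."""
    masked, labels = list(tokens), [None] * len(tokens)
    for i in range(text_start, len(tokens)):
        if tokens[i] != "[SEP]" and rng.random() < rate:
            masked[i], labels[i] = MASK, tokens[i]
    return masked, labels

# Made-up corpus and embeddings (assumptions, not data from the paper).
corpus = [
    "The alcohol was oxidized with PCC in dichloromethane.",
    "The ester was hydrolyzed with aqueous NaOH.",
    "Oxidation was carried out at room temperature.",
]
doc_vecs = [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
query_vec = [1.0, 0.0]  # hypothetical embedding of the reaction CCO>>CC=O

idx = top_k(query_vec, doc_vecs, k=2)
text = build_input("CCO>>CC=O", [corpus[i] for i in idx])
tokens = text.split()
# Position 0 is the reaction SMILES, so masking starts at position 1.
masked, labels = mask_text_tokens(tokens, 1, 0.15, random.Random(0))
```

In a trained system the embeddings would come from learned encoders and the masked tokens would be predicted jointly with the chemistry target; the sketch only shows how the pieces fit together.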

Code Implementations: 1 repo
