CHEM-PH · LG · QM · Jun 22, 2023

Beyond Chemical Language: A Multimodal Approach to Enhance Molecular Property Prediction

arXiv:2306.14919v1 · 6 citations · h-index: 17
Originality: Incremental advance
AI Analysis

This work addresses accurate molecular property prediction for researchers in chemistry and drug discovery. It is an incremental improvement that integrates multimodal data.

The authors combine chemical language embeddings with causally selected physicochemical features and report better performance than state-of-the-art methods such as MOLFORMER and graph neural networks on tasks such as biodegradability and PFAS toxicity estimation.

We present a novel multimodal language model approach for predicting molecular properties by combining chemical language representation with physicochemical features. Our approach, MULTIMODAL-MOLFORMER, utilizes a causal multistage feature selection method that identifies physicochemical features based on their direct causal effect on a specific target property. These causal features are then integrated with the vector space generated by molecular embeddings from MOLFORMER. In particular, we employ Mordred descriptors as physicochemical features and identify the Markov blanket of the target property, which theoretically contains the most relevant features for accurate prediction. Our results demonstrate superior performance of our proposed approach compared to existing state-of-the-art algorithms, including the chemical language-based MOLFORMER and graph neural networks, on complex prediction tasks such as biodegradability and PFAS toxicity estimation. Moreover, we demonstrate the effectiveness of our feature selection method in reducing the dimensionality of the Mordred feature space while maintaining or improving the model's performance. Our approach opens up promising avenues for future research in molecular property prediction by harnessing the synergistic potential of both chemical language and physicochemical features.
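The fusion idea in the abstract (select a small set of descriptors relevant to the target, then join them to the molecular embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation. The arrays are random placeholders standing in for MOLFORMER embeddings and Mordred descriptors, the function names `select_features` and `fuse` are hypothetical, and the selector ranks columns by absolute Pearson correlation as a simple stand-in for the Markov blanket discovery the paper describes.

```python
import numpy as np


def select_features(descriptors, target, k):
    """Return the indices of the k descriptor columns most correlated with target.

    Assumption: absolute Pearson correlation is only a simple stand-in here.
    It is NOT the causal (Markov blanket) selection described in the abstract.
    """
    d = descriptors - descriptors.mean(axis=0)
    t = target - target.mean()
    denom = np.sqrt((d ** 2).sum(axis=0) * (t ** 2).sum())
    # Constant columns have zero variance; score them as 0 instead of NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = np.where(denom > 0, (d * t[:, None]).sum(axis=0) / denom, 0.0)
    top = np.argsort(-np.abs(r))[:k]
    return np.sort(top)


def fuse(embeddings, descriptors, selected):
    """Standardize the selected descriptor columns and append them to the embeddings."""
    chosen = descriptors[:, selected]
    std = chosen.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    chosen = (chosen - chosen.mean(axis=0)) / std
    return np.concatenate([embeddings, chosen], axis=1)


# Placeholder data: 200 molecules, 8-dim "embeddings", 20 "descriptors".
rng = np.random.default_rng(0)
n = 200
embeddings = rng.normal(size=(n, 8))
descriptors = rng.normal(size=(n, 20))
# Synthetic target that depends on descriptor columns 3 and 11 only.
target = 2.0 * descriptors[:, 3] - 1.5 * descriptors[:, 11] + 0.1 * rng.normal(size=n)

selected = select_features(descriptors, target, k=2)
fused = fuse(embeddings, descriptors, selected)
print(selected, fused.shape)
```

The fused matrix would then be passed to any downstream regressor or classifier. Standardizing the descriptor columns before concatenation is a design choice of this sketch, made so that descriptors on large numeric scales do not dominate the embedding dimensions.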

