Liangren Zhang

QM
h-index33
6papers
539citations
Novelty57%
AI Score36

6 Papers

LGMar 13, 2023
Bridging the Gap between Chemical Reaction Pretraining and Conditional Molecule Generation with a Unified Model

Bo Qiang, Yiran Zhou, Yuheng Ding et al.

Chemical reactions are the fundamental building blocks of drug design and organic chemistry research. In recent years, there has been a growing need for a large-scale deep-learning framework that can efficiently capture the basic rules of chemical reactions. In this paper, we have proposed a unified framework that addresses both the reaction representation learning and molecule generation tasks, which allows for a more holistic approach. Inspired by the organic chemistry mechanism, we develop a novel pretraining framework that enables us to incorporate inductive biases into the model. Our framework achieves state-of-the-art results on challenging downstream tasks. By possessing chemical knowledge, our generative framework overcome the limitations of current molecule generation models that rely on a small number of reaction templates. In the extensive experiments, our model generates synthesizable drug-like structures of high quality. Overall, our work presents a significant step toward a large-scale deep-learning framework for a variety of reaction-based applications.

QMDec 24, 2019Code
TF3P: Three-dimensional Force Fields Fingerprint Learned by Deep Capsular Network

Yanxing Wang, Jianxing Hu, Junyong Lai et al.

Molecular fingerprints are the workhorse in ligand-based drug discovery. In recent years, an increasing number of research papers reported fascinating results on using deep neural networks to learn 2D molecular representations as fingerprints. It is anticipated that the integration of deep learning would also contribute to the prosperity of 3D fingerprints. Here, we unprecedentedly introduce deep learning into 3D small molecule fingerprints, presenting a new one we termed as the three-dimensional force fields fingerprint (TF3P). TF3P is learned by a deep capsular network whose training is in no need of labeled datasets for specific predictive tasks. TF3P can encode the 3D force fields information of molecules and demonstrates the stronger ability to capture 3D structural changes, to recognize molecules alike in 3D but not in 2D, and to identify similar targets inaccessible by other 2D or 3D fingerprints based on only ligands similarity. Furthermore, TF3P is compatible with both statistical models (e.g. similarity ensemble approach) and machine learning models. Altogether, we report TF3P as a new 3D small molecule fingerprint with a promising future in ligand-based drug discovery. All codes are written in Python and available at https://github.com/canisw/tf3p.

BMApr 10, 2024
Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation

Ningfeng Liu, Jie Yu, Siyu Xiu et al.

Molecular generation, an essential method for identifying new drug structures, has been supported by advancements in machine learning and computational technology. However, challenges remain in multi-objective generation, model adaptability, and practical application in drug discovery. In this study, we developed a versatile 'plug-in' molecular generation model that incorporates multiple objectives related to target affinity, drug-likeness, and synthesizability, facilitating its application in various drug development contexts. We improved the Particle Swarm Optimization (PSO) in the context of drug discoveries, and identified PSO-ENP as the optimal variant for multi-objective molecular generation and optimization through comparative experiments. The model also incorporates a novel target-ligand affinity predictor, enhancing the model's utility by supporting three-dimensional information and improving synthetic feasibility. Case studies focused on generating and optimizing drug-like big marine natural products were performed, underscoring PSO-ENP's effectiveness and demonstrating its considerable potential for practical drug discovery applications.

QMMar 22, 2025
NaFM: Pre-training a Foundation Model for Small-Molecule Natural Products

Yuheng Ding, Bo Qiang, Yiran Zhou et al.

Natural products, as metabolites from microorganisms, animals, or plants, exhibit diverse biological activities, making them crucial for drug discovery. Nowadays, existing deep learning methods for natural products research primarily rely on supervised learning approaches designed for specific downstream tasks. However, such one-model-for-a-task paradigm often lacks generalizability and leaves significant room for performance improvement. Additionally, existing molecular characterization methods are not well-suited for the unique tasks associated with natural products. To address these limitations, we have pre-trained a foundation model for natural products based on their unique properties. Our approach employs a novel pretraining strategy that is especially tailored to natural products. By incorporating contrastive learning and masked graph learning objectives, we emphasize evolutional information from molecular scaffolds while capturing side-chain information. Our framework achieves state-of-the-art (SOTA) results in various downstream tasks related to natural product mining and drug discovery. We first compare taxonomy classification with synthesized molecule-focused baselines to demonstrate that current models are inadequate for understanding natural synthesis. Furthermore, by diving into a fine-grained analysis at both the gene and microbial levels, NaFM demonstrates the ability to capture evolutionary information. Eventually, our method is experimented with virtual screening, illustrating informative natural product representations that can lead to more effective identification of potential drug candidates.

QMAug 20, 2019
DeepScaffold: a comprehensive tool for scaffold-based de novo drug discovery using deep learning

Yibo Li, Jianxing Hu, Yanxing Wang et al.

The ultimate goal of drug design is to find novel compounds with desirable pharmacological properties. Designing molecules retaining particular scaffolds as the core structures of the molecules is one of the efficient ways to obtain potential drug candidates with desirable properties. We proposed a scaffold-based molecular generative model for scaffold-based drug discovery, which performs molecule generation based on a wide spectrum of scaffold definitions, including BM-scaffolds, cyclic skeletons, as well as scaffolds with specifications on side-chain properties. The model can generalize the learned chemical rules of adding atoms and bonds to a given scaffold. Furthermore, the generated compounds were evaluated by molecular docking in DRD2 targets and the results demonstrated that this approach can be effectively applied to solve several drug design problems, including the generation of compounds containing a given scaffold and de novo drug design of potential drug candidates with specific docking scores. Finally, a command line interface is created.

QMJan 18, 2018
Multi-Objective De Novo Drug Design with Conditional Graph Generative Model

Yibo Li, Liangren Zhang, Zhenming Liu

Recently, deep generative models have revealed itself as a promising way of performing de novo molecule design. However, previous research has focused mainly on generating SMILES strings instead of molecular graphs. Although current graph generative models are available, they are often too general and computationally expensive, which restricts their application to molecules with small sizes. In this work, a new de novo molecular design framework is proposed based on a type sequential graph generators that do not use atom level recurrent units. Compared with previous graph generative models, the proposed method is much more tuned for molecule generation and have been scaled up to cover significantly larger molecules in the ChEMBL database. It is shown that the graph-based model outperforms SMILES based models in a variety of metrics, especially in the rate of valid outputs. For the application of drug design tasks, conditional graph generative model is employed. This method offers higher flexibility compared to previous fine-tuning based approach and is suitable for generation based on multiple objectives. This approach is applied to solve several drug design problems, including the generation of compounds containing a given scaffold, generation of compounds with specific drug-likeness and synthetic accessibility requirements, as well as generating dual inhibitors against JNK3 and GSK3$β$. Results show high enrichment rates for outputs satisfying the given requirements.