De Novo Generation of Hit-like Molecules from Gene Expression Profiles via Deep Learning
This work addresses inefficient molecule generation in drug discovery, offering pharmaceutical researchers a novel approach, though with only incremental improvements over existing methods.
The study tackles the challenge of generating hit-like molecules for drug discovery by proposing HNN2Mol, a hybrid neural network that uses gene expression profiles to generate molecular structures with desirable phenotypes for target proteins. The generated molecules show potential bioactivities and drug-like properties.
De novo generation of hit-like molecules is a challenging task in the drug discovery process. Most previous methods learn the semantics and syntax of molecular structures by analyzing molecular graphs or simplified molecular input line entry system (SMILES) strings; however, they do not take into account the drug responses of biological systems consisting of genes and proteins. In this study, we propose a hybrid neural network, HNN2Mol, which utilizes gene expression profiles to generate molecular structures with desirable phenotypes for arbitrary target proteins. In the algorithm, a variational autoencoder (VAE) is employed as a feature extractor to learn the latent feature distribution of the gene expression profiles. Then, a long short-term memory (LSTM) network serves as the chemical generator, producing syntactically valid SMILES strings that satisfy the feature conditions extracted from the gene expression profile by the feature extractor. Experimental results and case studies demonstrate that the proposed HNN2Mol model can produce new molecules with potential bioactivities and drug-like properties.
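The two-stage design described above can be illustrated with a minimal, standard-library-only sketch. It is not the HNN2Mol implementation. It shows the standard VAE reparameterization step and KL term, followed by a single LSTM step whose input is a token concatenated with the latent feature vector. Conditioning by concatenation is an assumption here, since the abstract does not state how the generator receives the extracted features. All names (`reparameterize`, `kl_to_standard_normal`, `lstm_step`), the toy dimensions, and the random weights are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, weights):
    """One LSTM step; weights[gate] = (W, b), W has shape hidden x (len(x) + len(h))."""
    xh = x + h  # concatenate input and previous hidden state

    def affine(gate):
        W, b = weights[gate]
        return [sum(w * v for w, v in zip(row, xh)) + bi
                for row, bi in zip(W, b)]

    i = [sigmoid(v) for v in affine("i")]    # input gate
    f = [sigmoid(v) for v in affine("f")]    # forget gate
    o = [sigmoid(v) for v in affine("o")]    # output gate
    g = [math.tanh(v) for v in affine("g")]  # candidate cell state
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new


rng = random.Random(0)

# Hypothetical encoder outputs for one gene expression profile (2 latent dims).
mu, log_var = [0.5, -1.0], [0.0, -2.0]
z = reparameterize(mu, log_var, rng)

# Assumed conditioning scheme: one-hot token (toy 3-symbol vocabulary) + z.
token = [1.0, 0.0, 0.0]
x = token + z

hidden = 4
weights = {
    gate: ([[rng.uniform(-0.1, 0.1) for _ in range(len(x) + hidden)]
            for _ in range(hidden)],
           [0.0] * hidden)
    for gate in "ifog"
}
h, c = lstm_step(x, [0.0] * hidden, [0.0] * hidden, weights)
```

In a trained model, the hidden state `h` would feed an output layer over the SMILES vocabulary, and the step would repeat until an end-of-sequence token is emitted. Those parts are omitted from this sketch.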