Aug 17, 2024
Fragment-Masked Diffusion for Molecular Optimization
Kun Li, Xiantao Cai, Jia Wu et al.
Molecular optimization is a crucial aspect of drug discovery, aimed at refining molecular structures to enhance drug efficacy and minimize side effects, ultimately accelerating the overall drug development process. Many molecular optimization methods have been proposed, significantly advancing drug discovery. These methods rely primarily on understanding specific drug target structures or their hypothesized roles in combating diseases. However, challenges such as the limited number of available targets and the difficulty of capturing clear structures hinder innovative drug development. In contrast, phenotypic drug discovery (PDD) does not depend on clear target structures and can identify hits with novel and unbiased polypharmacology signatures. As a result, PDD-based molecular optimization can reduce potential safety risks while optimizing phenotypic activity, thereby increasing the likelihood of clinical success. Therefore, we propose a fragment-masked molecular optimization method based on PDD (FMOP). FMOP employs a regression-free diffusion model to conditionally optimize the masked regions of a molecule, effectively generating new molecules with similar scaffolds. On the large-scale drug response dataset GDSCv2, we optimize the potential molecules across all 985 cell lines. The overall experiments demonstrate that the in-silico optimization success rate reaches 95.4%, with an average efficacy increase of 7.5%. Additionally, we conduct extensive ablation and visualization experiments, confirming that FMOP is an effective and robust molecular optimization method. The code is available at: https://anonymous.4open.science/r/FMOP-98C2.
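The idea of optimizing only a masked region while leaving the scaffold intact can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the FMOP implementation: the molecule is stood in for by a plain list of floats, and `masked_refine`, `toy_denoiser`, and `num_steps` are hypothetical names introduced here. The loop shows the general masked-update pattern: at every step a proposal is computed for the whole vector, but it is accepted only inside the mask.

```python
import random


def masked_refine(original, mask, denoise_step, num_steps=10):
    """Iteratively update only the masked positions of a feature vector.

    original:     list of floats, a toy stand-in for a molecular representation
    mask:         list of bools, True where the fragment may be rewritten
    denoise_step: hypothetical callable (vector, step) -> vector
    """
    if len(original) != len(mask):
        raise ValueError("original and mask must have the same length")
    rng = random.Random(0)
    # Start the masked region from noise; keep the scaffold as given.
    current = [rng.gauss(0.0, 1.0) if m else x for x, m in zip(original, mask)]
    for step in reversed(range(num_steps)):
        proposal = denoise_step(current, step)
        # Accept the proposal only inside the mask; restore the scaffold elsewhere.
        current = [p if m else x for p, x, m in zip(proposal, original, mask)]
    return current


def toy_denoiser(vector, step):
    # Placeholder for a learned model: pulls each value halfway toward the mean.
    mean = sum(vector) / len(vector)
    return [0.5 * (v + mean) for v in vector]


scaffold = [1.0, 2.0, 3.0, 4.0]
fragment_mask = [False, False, True, True]
refined = masked_refine(scaffold, fragment_mask, toy_denoiser)
```

Because the unmasked entries are copied back from `original` after every step, the first two values of `refined` are exactly those of `scaffold`, which is the toy analogue of generating new molecules with similar scaffolds.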
Oct 23, 2022
Towards a Better Model with Dual Transformer for Drug Response Prediction
Kun Li, Jia Wu, Bo Du et al.
GNN-based methods have achieved excellent results as a mainstream approach to drug response prediction in recent years. Traditional GNN methods use only the atoms in a drug molecule as nodes and obtain the representation of the molecular graph through node information passing, whereas transformer-based methods can only extract information about the nodes. However, the covalent bonding and chirality of a drug molecule have a great influence on its pharmacological properties, and this information is implied in the chemical bonds that form the edges between atoms. In addition, CNN methods for modelling cell line genomics sequences can only perceive local rather than global information about the sequence. To solve these problems, we propose a decoupled dual transformer structure with edge embedding for drug response prediction (TransEDRP), in which separate branches represent the cell line genomics and the drug. For the drug branch, we encode the chemical bond information within the molecule as the edge embeddings of the molecular graph and extract the global structural and biochemical information of the drug molecule using a graph transformer. For the cell line genomics branch, we use a multi-head attention mechanism to represent the genomics sequence globally. Finally, the drug and genomics branches, which are different modalities, are fused through a transformer layer and a fully connected layer to predict IC50 values. Extensive experiments show that our method outperforms current mainstream approaches on all evaluation metrics.
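The contrast between local CNN receptive fields and global attention can be made concrete with a small example. The following is a simplified sketch, not the TransEDRP code: it uses a single head with identity projections (queries, keys, and values are all the input itself), whereas a real multi-head layer would use several heads with learned projections. The names `self_attention` and `features` are hypothetical. The point it shows is that every position receives a nonzero weight from every other position, so each output mixes information from the whole sequence.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(sequence):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(sequence[0])
    weights, outputs = [], []
    for query in sequence:
        # Score the query against every position in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in sequence
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average over all positions.
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, sequence)) for d in range(dim)]
        )
    return outputs, weights


# Toy stand-in for a sequence of genomic feature vectors.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
outputs, weights = self_attention(features)
```

Each row of `weights` is a probability distribution over all four positions, in contrast to a convolution, whose output at a position depends only on a fixed-size neighbourhood.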