QM | AI | LG | MN | Jan 29, 2024

Unveiling Molecular Moieties through Hierarchical Grad-CAM Graph Explainability

arXiv:2402.01744v5 | 3 citations | h-index: 16 | BMC Bioinformatics
AI Analysis

This work addresses the need for explainable AI in drug discovery, offering a tool to aid computational chemists in optimizing molecular structures and repurposing drugs, though it is incremental as it builds on existing Grad-CAM methods.

The researchers tackled the challenge of interpreting Graph Neural Networks (GNNs) in drug discovery by developing the Hierarchical Grad-CAM graph Explainer (HGE) framework, which identifies the molecular substructures most relevant to protein-ligand binding. The explanations were validated against experimental data from the literature, and the underlying GNN classifiers achieved state-of-the-art performance in virtual screening tasks.

Background: Virtual Screening (VS) has become an essential tool in drug discovery, enabling the rapid and cost-effective identification of potential bioactive molecules. Among recent advancements, Graph Neural Networks (GNNs) have gained prominence for their ability to model complex molecular structures using graph-based representations. However, the integration of explainable methods to elucidate the specific contributions of molecular substructures to biological activity remains a significant challenge. This limitation hampers both the interpretability of predictive models and the rational design of novel therapeutics.

Results: We trained 20 GNN models on a dataset of small molecules with the goal of predicting their activity on 20 distinct protein targets from the Kinase family. These classifiers achieved state-of-the-art performance in virtual screening tasks, demonstrating high accuracy and robustness on different targets. Building upon these models, we implemented the Hierarchical Grad-CAM graph Explainer (HGE) framework, enabling an in-depth analysis of the molecular moieties driving protein-ligand binding stabilization. HGE exploits Grad-CAM explanations at the atom, ring, and whole-molecule levels, leveraging the message-passing mechanism to highlight the most relevant chemical moieties. Validation against experimental data from the literature confirmed the ability of the explainer to recognize a molecular pattern of drugs and correctly annotate them to the known target.

Conclusion: Our approach may represent a valid support to shorten both the screening and the hit discovery process. Detailed knowledge of the molecular substructures that play a role in the binding process can help the computational chemist to gain insights into the structure optimization, as well as in drug repurposing tasks.
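To make the idea of atom-, ring-, and molecule-level Grad-CAM explanations more concrete, here is a minimal sketch in plain Python. It is not the authors' HGE implementation, and everything in it is an assumption made for illustration: the toy four-atom molecule, the single mean-aggregation message-passing layer, the mean-pool plus linear readout (chosen so the Grad-CAM gradients have a closed form and no autodiff library is needed), the weights, and the choice to score a ring as the mean of its atom scores. The abstract does not specify how the hierarchy levels are computed.

```python
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(n_atoms: int, bonds: Sequence[Tuple[int, int]]) -> Matrix:
    """Adjacency with self-loops, row-normalized (each atom averages itself and its neighbours)."""
    adj = [[1.0 if i == j else 0.0 for j in range(n_atoms)] for i in range(n_atoms)]
    for i, j in bonds:
        adj[i][j] = adj[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def message_passing(adj_norm: Matrix, atom_feats: Matrix, weight: Matrix) -> Matrix:
    """One message-passing step: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, atom_feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def grad_cam_atom_scores(feature_map: Matrix, readout_w: Sequence[float]) -> List[float]:
    """Grad-CAM heat value per atom for the assumed readout y = sum_k w_k * mean_n F[n][k]."""
    n_atoms = len(feature_map)
    # Closed-form gradient of the assumed readout: dy/dF[n][k] = w_k / n_atoms.
    grads = [[w / n_atoms for w in readout_w] for _ in range(n_atoms)]
    # Channel importance alpha_k: gradient averaged over all atoms.
    alphas = [sum(col) / n_atoms for col in zip(*grads)]
    # Heat value per atom: ReLU of the alpha-weighted sum of its channels.
    return [max(0.0, sum(a * f for a, f in zip(alphas, row))) for row in feature_map]


def hierarchical_scores(atom_scores: Sequence[float],
                        rings: Sequence[Sequence[int]]) -> Dict[str, object]:
    """Aggregate atom scores to ring and molecule level (the aggregation rule is an assumption)."""
    return {
        "atoms": list(atom_scores),
        "rings": [sum(atom_scores[i] for i in ring) / len(ring) for ring in rings],
        "molecule": sum(atom_scores),
    }


# Hypothetical toy molecule: a three-membered ring (atoms 0, 1, 2) with a substituent (atom 3).
bonds = [(0, 1), (1, 2), (2, 0), (2, 3)]
rings = [[0, 1, 2]]
atom_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # one-hot atom types
weight = [[0.5, -0.2], [0.3, 0.8]]  # illustrative "trained" layer weights
readout_w = [1.0, 0.5]              # illustrative readout weights for the "active" class

feature_map = message_passing(normalized_adjacency(4, bonds), atom_feats, weight)
atom_scores = grad_cam_atom_scores(feature_map, readout_w)
levels = hierarchical_scores(atom_scores, rings)
print(levels)
```

In a real pipeline the gradients would come from backpropagation through a trained GNN rather than from a closed-form expression, and ring membership would come from a cheminformatics toolkit rather than a hand-written list.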

Code Implementations: 1 repo