LG | ML | Oct 9, 2020

Gini in a Bottleneck: Sparse Molecular Representations for Graph Convolutional Neural Networks

arXiv:2010.04535v3 | 1 citation
AI Analysis

This work addresses interpretability challenges for chemists using deep learning models on molecular graphs. It is incremental: it adds a sparsity constraint to existing graph convolutional methods.

The paper tackles interpretability in graph convolutional neural networks for molecular property prediction by applying a Gini index constraint that sparsifies the learned representations. The authors show that the constraint does not degrade evaluation metrics for some targets and that it enables visually interpretable outputs.

Due to the nature of deep learning approaches, it is inherently difficult to understand which aspects of a molecular graph drive the predictions of the network. As a mitigation strategy, we constrain certain weights in a multi-task graph convolutional neural network according to the Gini index to maximize the "inequality" of the learned representations. We show that this constraint does not degrade evaluation metrics for some targets, and allows us to combine the outputs of the graph convolutional operation in a visually interpretable way. We then perform a proof-of-concept experiment on quantum chemistry targets on the public QM9 dataset, and a larger experiment on ADMET targets on proprietary drug-like molecules. Since a benchmark of explainability in the latter case is difficult, we informally surveyed medicinal chemists within our organization to check for agreement between regions of the molecule they and the model identified as relevant to the properties in question.
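To make the "inequality" idea in the abstract concrete, the following is a minimal sketch of the standard Gini index applied to the absolute values of a weight vector. The abstract does not give the paper's exact formulation or say which weights are constrained, so the function name `gini_index`, the `eps` guard, and the use of absolute values are assumptions made for illustration only.

```python
import numpy as np


def gini_index(weights, eps=1e-12):
    """Gini index of the absolute values of a weight vector.

    Hypothetical helper for illustration; not the paper's implementation.
    Returns 0.0 when all magnitudes are equal and approaches 1.0 as the
    total magnitude concentrates in a single entry (a sparse vector).
    """
    # Sort magnitudes in ascending order; the sign of a weight is ignored.
    w = np.sort(np.abs(np.asarray(weights, dtype=float)).ravel())
    n = w.size
    total = w.sum()
    # An empty or all-zero vector has no inequality to measure.
    if n == 0 or total < eps:
        return 0.0
    ranks = np.arange(1, n + 1)
    # Standard closed form for sorted non-negative data.
    return float(2.0 * np.sum(ranks * w) / (n * total) - (n + 1) / n)


uniform = gini_index([1.0, 1.0, 1.0, 1.0])  # 0.0: no inequality
sparse = gini_index([0.0, 0.0, 0.0, 1.0])   # 0.75: the maximum (n - 1) / n for n = 4
```

Under these assumptions, a higher value means a sparser weight vector. One plausible way to encourage sparsity during training would be to add a penalty proportional to `1 - gini_index(w)` to the task loss, but whether the authors do this is not stated in the abstract.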

