LG, COMP-PH, ML | Feb 19, 2020

Molecule Attention Transformer

arXiv:2002.08264v10.00199 citations
AI Analysis 50

This paper addresses the problem of enabling widespread deep learning use in drug discovery by providing a versatile architecture for molecule prediction tasks, though it appears incremental because it builds on existing Transformer methods.

The paper tackles the challenge of designing a single neural network architecture for diverse molecule property prediction tasks. It proposes the Molecule Attention Transformer (MAT), which augments Transformer attention with inter-atomic distances and molecular graph structure. MAT performs competitively across tasks and, with simple self-supervised pretraining, reaches state-of-the-art results on downstream tasks.

Designing a single neural network architecture that performs competitively across a range of molecule property prediction tasks remains largely an open challenge, and its solution may unlock a widespread use of deep learning in the drug discovery industry. To move towards this goal, we propose Molecule Attention Transformer (MAT). Our key innovation is to augment the attention mechanism in Transformer using inter-atomic distances and the molecular graph structure. Experiments show that MAT performs competitively on a diverse set of molecular prediction tasks. Most importantly, with a simple self-supervised pretraining, MAT requires tuning of only a few hyperparameter values to achieve state-of-the-art performance on downstream tasks. Finally, we show that attention weights learned by MAT are interpretable from the chemical point of view.
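As a rough illustration of what "augmenting attention with inter-atomic distances and the molecular graph structure" could look like, the sketch below blends three per-atom weight matrices: ordinary scaled dot-product attention, a distance-based term, and the graph adjacency matrix. The abstract does not give the formula, so the weighted-sum form, the row-wise softmax of negative distances, the function name `molecule_attention`, and the `lam_*` mixing weights are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def molecule_attention(q, k, v, dist, adj,
                       lam_attn=0.4, lam_dist=0.3, lam_graph=0.3):
    """Hypothetical sketch: mix self-attention with distance and graph terms.

    q, k, v : per-atom query/key/value vectors (n_atoms x d)
    dist    : inter-atomic distance matrix (n_atoms x n_atoms)
    adj     : molecular graph adjacency matrix (n_atoms x n_atoms)
    lam_*   : assumed scalar mixing weights (hyperparameters)
    """
    n, d_k = len(q), len(q[0])
    k_t = [list(col) for col in zip(*k)]
    # Standard scaled dot-product attention weights.
    attn = [softmax([s / math.sqrt(d_k) for s in row]) for row in matmul(q, k_t)]
    # Assumed distance term: nearer atoms receive larger weight.
    dist_w = [softmax([-d for d in row]) for row in dist]
    # Weighted sum of the three sources of structure.
    weights = [[lam_attn * attn[i][j] + lam_dist * dist_w[i][j] + lam_graph * adj[i][j]
                for j in range(n)] for i in range(n)]
    return matmul(weights, v), weights


# Toy three-atom chain (0-1-2) with made-up features and distances.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dist = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
out, weights = molecule_attention(q, q, q, dist, adj)
```

Setting `lam_dist` and `lam_graph` to zero recovers plain self-attention in this sketch, which makes the three mixing weights a natural candidate for the small set of hyperparameters to tune.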

Code Implementations: 7 repos

Data from Papers with Code (CC-BY-SA-4.0)
