LG · CL · QM · Apr 5, 2024

Transformers for molecular property prediction: Lessons learned from the past five years

arXiv:2404.03969v1 · 41 citations · h-index: 23 · J Chem Inf Model
Originality: Synthesis-oriented
AI Analysis

The paper offers a synthesis for researchers in drug discovery and related fields, but it is incremental: it surveys existing work without introducing novel methods.

This review analyzes the use of transformer models for molecular property prediction, identifying key challenges in training and comparison, but does not report specific numerical results or new findings.

Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill insights from current research on employing transformer models for MPP. We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pre-training data, optimal architecture selections, and promising pre-training objectives. Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field's understanding. Additionally, we address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.

Code Implementations: 1 repo
