ScaffoldGPT: A Scaffold-based GPT Model for Drug Optimization
This work addresses drug optimization to combat fast-mutating viruses and drug-resistant cancers, presenting an incremental improvement through a novel scaffold-based approach.
The paper tackles drug optimization by introducing ScaffoldGPT, a GPT model that uses molecular scaffolds to retain beneficial properties while enhancing desired attributes, demonstrating improved performance on COVID and cancer benchmarks compared to baselines.
Drug optimization has become increasingly crucial in light of fast-mutating virus strains and drug-resistant cancer cells. Nevertheless, it remains challenging, as it requires retaining the beneficial properties of the original drug while simultaneously enhancing desired attributes beyond its scope. In this work, we aim to tackle this challenge by introducing ScaffoldGPT, a novel Generative Pretrained Transformer (GPT) designed for drug optimization based on molecular scaffolds. Our work comprises three key components: (1) A three-stage drug optimization approach that integrates pretraining, finetuning, and decoding optimization. (2) A novel two-phase incremental pretraining strategy for scaffold-based drug optimization. (3) A token-level decoding optimization strategy, Top-N, that enables controlled, reward-guided generation using the pretrained or finetuned GPT. Through a comprehensive evaluation on COVID and cancer benchmarks, we demonstrate that ScaffoldGPT outperforms competing baselines in drug optimization, excelling at preserving the original functional scaffold while enhancing desired properties.
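
The abstract does not spell out how Top-N works, so the following is only a minimal sketch of one plausible reading of "token-level, reward-guided decoding": at each step, keep the N most probable next tokens and pick the one whose extended sequence scores highest under a reward function. It is not the paper's algorithm. Every name here (`toy_next_token_probs`, `toy_reward`, `top_n_decode`) and the toy token distribution are hypothetical stand-ins for a real GPT and a real property predictor.

```python
from typing import Callable, Dict, List


def toy_next_token_probs(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a GPT's next-token distribution."""
    # Force termination once the sequence has four tokens.
    if len(prefix) >= 4:
        return {"<eos>": 1.0}
    return {"C": 0.5, "N": 0.3, "O": 0.15, "<eos>": 0.05}


def toy_reward(tokens: List[str]) -> float:
    """Hypothetical reward: number of nitrogen tokens in the sequence."""
    return float(tokens.count("N"))


def top_n_decode(
    next_probs: Callable[[List[str]], Dict[str, float]],
    reward: Callable[[List[str]], float],
    n: int = 2,
    max_len: int = 8,
) -> List[str]:
    """Greedy decoding restricted to the n most probable tokens per step.

    Among those n candidates, choose the token whose extended sequence
    has the highest reward; ties are broken by model probability.
    """
    seq: List[str] = []
    for _ in range(max_len):
        probs = next_probs(seq)
        # Keep only the n most probable candidate tokens.
        candidates = sorted(probs, key=probs.get, reverse=True)[:n]
        # Rank candidates by reward first, then by model probability.
        best = max(candidates, key=lambda t: (reward(seq + [t]), probs[t]))
        if best == "<eos>":
            break
        seq.append(best)
    return seq


guided = top_n_decode(toy_next_token_probs, toy_reward, n=2)
greedy = top_n_decode(toy_next_token_probs, toy_reward, n=1)
```

With `n=1` the sketch reduces to ordinary greedy decoding, since the reward has only one candidate to choose from. Larger `n` gives the reward more influence, at the cost of drifting further from the model's most likely output.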