LG · Jun 13, 2023
Reinforcement Learning-Driven Linker Design via Fast Attention-based Point Cloud Alignment
Rebecca M. Neeser, Mehmet Akdel, Daniel Kovtun et al.
Proteolysis-Targeting Chimeras (PROTACs) represent a novel class of small molecules designed to act as a bridge between an E3 ligase and a disease-relevant protein, thereby promoting degradation of that protein. PROTACs are composed of two protein-binding "active" domains joined by a "linker" domain. Designing the linker is challenging because of the geometric and chemical constraints imposed by its interactions and the need to maximize drug-likeness. To tackle these challenges, we introduce ShapeLinker, a method for de novo design of linkers. It performs fragment-linking using reinforcement learning on an autoregressive SMILES generator. The method optimizes for a composite score combining relevant physicochemical properties and a novel, attention-based point cloud alignment score. This new method successfully generates linkers that satisfy both relevant 2D and 3D requirements, and achieves state-of-the-art results in producing novel linkers assuming a target linker conformation. This enables more rational and efficient PROTAC design and optimization. Code and data are available at https://github.com/aivant/ShapeLinker.
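To make the idea of a composite score concrete, the following is a minimal sketch of how several per-molecule component scores could be folded into a single reward. The abstract does not say how ShapeLinker aggregates its components, so the weighted geometric mean used here is an assumption, and the component names (`shape_alignment`, `drug_likeness`, `linker_length`) and the function `composite_score` are hypothetical.

```python
import math


def composite_score(components, weights):
    """Combine component scores in [0, 1] with a weighted geometric mean.

    Assumption: this aggregation is illustrative only; it is not taken
    from the ShapeLinker paper or code base.
    """
    if components.keys() != weights.keys():
        raise ValueError("components and weights must have the same keys")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    log_sum = 0.0
    for name, score in components.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1]")
        weight = weights[name]
        if weight <= 0:
            continue  # components with non-positive weight are ignored
        if score == 0.0:
            return 0.0  # one failed component zeroes the whole reward
        log_sum += weight * math.log(score)
    return math.exp(log_sum / total_weight)


# Hypothetical component scores for one generated linker.
example = composite_score(
    {"shape_alignment": 0.25, "drug_likeness": 1.0, "linker_length": 1.0},
    {"shape_alignment": 1.0, "drug_likeness": 1.0, "linker_length": 0.0},
)
```

A geometric mean is chosen in this sketch because a linker that fails one requirement entirely receives a reward of zero, whereas an arithmetic mean would let a strong 2D score mask a poor 3D fit.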
LG · Dec 20, 2023
FSscore: A Machine Learning-based Synthetic Feasibility Score Leveraging Human Expertise
Rebecca M. Neeser, Bruno Correia, Philippe Schwaller
Determining whether a molecule can be synthesized is crucial in chemistry and drug discovery, as it guides experimental prioritization and molecule ranking in de novo design tasks. Existing scoring approaches to assess synthetic feasibility struggle to extrapolate to new chemical spaces or fail to discriminate based on subtle differences such as chirality. This work addresses these limitations by introducing the Focused Synthesizability score (FSscore), which uses machine learning to rank structures based on their relative ease of synthesis. First, a baseline trained on an extensive set of reactant-product pairs is established, which is then refined with expert human feedback tailored to specific chemical spaces. This targeted fine-tuning improves performance on these chemical scopes, enabling more accurate differentiation between molecules that are hard and easy to synthesize. The FSscore showcases how a human-in-the-loop framework can be utilized to optimize the assessment of synthetic feasibility for various chemical applications.
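Ranking molecules by relative ease of synthesis can be illustrated with a pairwise preference loss. The sketch below is not the FSscore implementation: the logistic (Bradley-Terry style) loss, the convention that a higher score means harder to synthesize, and the function names `softplus` and `pairwise_ranking_loss` are all assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_ranking_loss(score_easier, score_harder):
    """Logistic loss for one labeled pair of molecules.

    Assumption: higher scores mean harder synthesis, so the loss is small
    when the molecule labeled harder receives the larger score.
    """
    return softplus(-(score_harder - score_easier))


def mean_pairwise_loss(pairs):
    """Average the loss over (score_easier, score_harder) pairs."""
    if not pairs:
        raise ValueError("at least one pair is required")
    return sum(pairwise_ranking_loss(e, h) for e, h in pairs) / len(pairs)


# A correctly ordered pair is penalized less than a misordered one.
correct = pairwise_ranking_loss(score_easier=0.0, score_harder=3.0)
wrong = pairwise_ranking_loss(score_easier=3.0, score_harder=0.0)
```

Under this formulation, both a reactant-product pair and an expert's preference between two molecules reduce to the same kind of training signal, an ordered pair, which is one plausible reason a pairwise objective suits fine-tuning with human feedback.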