Henry van den Bedem

RO
3 papers
27 citations
Novelty: 60%
AI Score: 26

3 Papers

QM, Oct 19, 2022
An efficient graph generative model for navigating ultra-large combinatorial synthesis libraries

Aryan Pedawi, Pawel Gniewek, Chaoyi Chang et al.

Virtual, make-on-demand chemical libraries have transformed early-stage drug discovery by unlocking vast, synthetically accessible regions of chemical space. Recent years have witnessed rapid growth in these libraries from millions to trillions of compounds, hiding undiscovered, potent hits for a variety of therapeutic targets. However, they are quickly approaching a size beyond that which permits explicit enumeration, presenting new challenges for virtual screening. To overcome these challenges, we propose the Combinatorial Synthesis Library Variational Auto-Encoder (CSLVAE). The proposed generative model represents such libraries as a differentiable, hierarchically-organized database. Given a compound from the library, the molecular encoder constructs a query for retrieval, which is utilized by the molecular decoder to reconstruct the compound by first decoding its chemical reaction and subsequently decoding its reactants. Our design minimizes autoregression in the decoder, facilitating the generation of large, valid molecular graphs. Our method performs fast and parallel batch inference for ultra-large synthesis libraries, enabling a number of important applications in early-stage drug discovery. Compounds proposed by our method are guaranteed to be in the library, and thus synthetically and cost-effectively accessible. Importantly, CSLVAE can encode out-of-library compounds and search for in-library analogues. In experiments, we demonstrate the capabilities of the proposed method in the navigation of massive combinatorial synthesis libraries.

BM, Jul 14, 2020
Sequence-guided protein structure determination using graph convolutional and recurrent networks

Po-Nan Li, Saulo H. P. de Oliveira, Soichi Wakatsuki et al.

Single-particle cryogenic electron microscopy (cryo-EM) experiments now routinely produce high-resolution data for large proteins and their complexes. Building an atomic model into a cryo-EM density map is challenging, particularly when no structure for the target protein is known a priori. Existing protocols for this type of task often rely on significant human intervention and can take hours to many days to produce an output. Here, we present a fully automated, template-free model building approach that is based entirely on neural networks. We use a graph convolutional network (GCN) to generate an embedding from a set of rotamer-based amino acid identities and candidate 3-dimensional Cα locations. Starting from this embedding, we use a bidirectional long short-term memory (LSTM) module to order and label the candidate identities and atomic locations consistent with the input protein sequence to obtain a structural model. Our approach paves the way for determining protein structures from cryo-EM densities in a fraction of the time of existing approaches and without the need for human intervention.

RO, Jul 25, 2016
Collision-Free Poisson Motion Planning in Ultra High-Dimensional Molecular Conformation Spaces

Rasmus Fonseca, Dominik Budday, Henry van den Bedem

The function of proteins, RNA, and DNA is modulated by fast, dynamic exchanges between three-dimensional conformations. Conformational sampling of biomolecules with exact and nullspace inverse kinematics, using rotatable bonds as revolute joints and non-covalent interactions as holonomic constraints, can accurately characterize these native ensembles. However, sampling biomolecules remains challenging owing to their ultra-high dimensional configuration spaces and the requirement to avoid (self-)collisions, which results in low acceptance rates. Here, we present two novel mechanisms to overcome these limitations. First, we introduce temporary constraints between near-colliding links. The resulting constraint varieties instantaneously redirect the search for collision-free conformations, and couple motions between distant parts of the linkage. Second, we adapt a randomized Poisson-disk motion planner, which prevents local oversampling and widens the search, to ultra-high dimensions. We evaluate our algorithm on several model systems. Our contributions apply to general high-dimensional motion planning problems in static and dynamic environments with obstacles.
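
The Poisson-disk idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' planner: it is plain dart-throwing rejection sampling in a unit hypercube, with no kinematics, constraints, or collision checks. It only shows how a minimum-distance rule prevents local oversampling. The function name `poisson_disk_samples` and all parameters are hypothetical and chosen for illustration.

```python
import math
import random


def poisson_disk_samples(dim, radius, n_attempts, seed=0):
    """Naive dart-throwing Poisson-disk sampling in the unit hypercube.

    A candidate is accepted only if it lies at least `radius` away from
    every previously accepted sample, so no region is sampled densely.
    Illustrative assumption: uniform candidates in [0, 1)^dim.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_attempts):
        candidate = [rng.random() for _ in range(dim)]
        # Reject candidates that fall inside the disk of an existing sample.
        if all(math.dist(candidate, p) >= radius for p in accepted):
            accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    samples = poisson_disk_samples(dim=10, radius=0.5, n_attempts=200)
    print(f"accepted {len(samples)} of 200 candidates")
```

Each candidate is compared against every accepted sample, so the cost grows quadratically with the number of samples. A practical planner would need a spatial index or a different candidate-generation scheme, which this sketch deliberately omits.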