David Pekker

2 Papers

3.6 BM Apr 15
Polyformer: a generative framework for thermodynamic modeling of polymeric molecules

Alessio Valentini, David Pekker, Chungwen Liang et al.

The classic paradigm of structural biology is that the sequence of a biomolecule (protein, nucleic acid, lipid, etc.) determines its conformation (shape), which in turn determines its biological function. Protein folding programs like AlphaFold address this paradigm by predicting the single best conformation for a given sequence. However, biomolecules are not static structures, and it is their conformational ensemble that determines their function. We present the Polyformer, a generative framework for thermodynamic modeling of polymeric molecules. Given the sequence and temperature (or another thermodynamic variable), the Polyformer generates conformations faithful to the molecule's thermodynamic conformational ensemble. It is the first generative model that solves three problems simultaneously: how a molecule folds, what its conformational ensemble is, and how that ensemble changes with temperature. As a concrete test case, we apply the Polyformer to protein domains with 50-111 residues and report good agreement between model predictions and Molecular Dynamics (MD) trajectories.
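To make the phrase "thermodynamic conformational ensemble" concrete, the sketch below computes canonical (Boltzmann) populations for a two-state toy system at two temperatures. It is not the Polyformer and says nothing about how that model works. The function name `boltzmann_populations`, the two-state setup, and all energies and degeneracies are hypothetical values chosen only for illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def boltzmann_populations(energies, degeneracies, temperature):
    """Return canonical populations p_i = g_i * exp(-E_i / kT) / Z."""
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)  # shift energies for numerical stability
    weights = [g * math.exp(-beta * (e - e_min))
               for e, g in zip(energies, degeneracies)]
    z = sum(weights)  # partition function (up to the constant shift)
    return [w / z for w in weights]


# Hypothetical two-state toy: one low-energy folded state and many
# higher-energy unfolded microstates. All numbers are made up.
ENERGIES = [0.0, 5.0]          # kcal/mol: folded, unfolded
DEGENERACIES = [1.0, 1000.0]   # microstates per state

for t in (300.0, 400.0):
    folded, unfolded = boltzmann_populations(ENERGIES, DEGENERACIES, t)
    print(f"T = {t:.0f} K: folded = {folded:.3f}, unfolded = {unfolded:.3f}")
```

In this toy, raising the temperature shifts population from the folded state to the more numerous unfolded states, which is the kind of temperature dependence a single predicted structure cannot express.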

CHEM-PH Aug 9, 2022
Machine Learning 1- and 2-electron reduced density matrices of polymeric molecules

David Pekker, Chungwen Liang, Sankha Pattanayak et al.

Encoding the electronic structure of molecules using 2-electron reduced density matrices (2RDMs) rather than many-body wave functions has been a decades-long quest, as the 2RDM contains sufficient information to compute the exact molecular energy but requires only polynomial storage. We focus on linear polymers with varying conformations and numbers of monomers and show that machine learning can predict both the 1-electron and the 2-electron reduced density matrices. Moreover, by applying the Hamiltonian operator to the predicted reduced density matrices, we show that we can recover the molecular energy. Thus, we demonstrate the feasibility of a machine learning approach to predicting electronic structure that generalizes both to new conformations and to new molecules. At the same time, our work circumvents the N-representability problem that has stymied the adoption of 2RDM methods, by directly machine-learning valid reduced density matrices.
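The statement that the energy can be recovered from the reduced density matrices can be illustrated with a small sketch. It assumes a spin-orbital basis, physicist-notation two-electron integrals g[p][q][r][s] = <pq|rs>, and a 2RDM normalized to N(N-1), so that E = sum_pq h_pq D_pq + 1/2 sum_pqrs g_pqrs Gamma_pqrs. The paper's own conventions may differ. The integrals are made-up toy numbers, not data for any molecule, and the RDMs come from a single Slater determinant rather than from a machine-learned model.

```python
from itertools import product


def energy_from_rdms(h, g, d1, d2):
    """Contract one- and two-electron integrals with the 1RDM and 2RDM."""
    n = len(h)
    e1 = sum(h[p][q] * d1[p][q] for p, q in product(range(n), repeat=2))
    e2 = 0.5 * sum(g[p][q][r][s] * d2[p][q][r][s]
                   for p, q, r, s in product(range(n), repeat=4))
    return e1 + e2


def determinant_rdms(n, occupied):
    """1RDM and 2RDM of a single Slater determinant (spin orbitals)."""
    d1 = [[1.0 if p == q and p in occupied else 0.0 for q in range(n)]
          for p in range(n)]
    # For a determinant: Gamma_pqrs = D_pr * D_qs - D_ps * D_qr
    d2 = [[[[d1[p][r] * d1[q][s] - d1[p][s] * d1[q][r]
             for s in range(n)] for r in range(n)]
           for q in range(n)] for p in range(n)]
    return d1, d2


# Hypothetical toy integrals in three spin orbitals (made-up numbers).
N_ORB = 3
OCCUPIED = {0, 1}
H = [[-1.0 / (1 + p) if p == q else 0.05 for q in range(N_ORB)]
     for p in range(N_ORB)]
G = [[[[0.2 / (1 + abs(p - r) + abs(q - s)) for s in range(N_ORB)]
       for r in range(N_ORB)] for q in range(N_ORB)] for p in range(N_ORB)]

D1, D2 = determinant_rdms(N_ORB, OCCUPIED)
E_RDM = energy_from_rdms(H, G, D1, D2)

# Cross-check against the direct single-determinant energy expression.
E_DIRECT = sum(H[i][i] for i in OCCUPIED) + 0.5 * sum(
    G[i][j][i][j] - G[i][j][j][i] for i in OCCUPIED for j in OCCUPIED)
print(E_RDM, E_DIRECT)
```

Because the 2RDM here is built from a determinant, it is N-representable by construction. An arbitrary four-index array fed into `energy_from_rdms` carries no such guarantee, which is the difficulty the abstract refers to.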